APPENDIX A


A MODEL OF ENVIRONMENTAL CONTROL SYSTEMS 

In 1973 L. D. Maxim and D. E. Cullen of MATHEMATICA, Inc. presented at 
the 44th National ORSA meeting in San Diego a paper entitled "A Model of
Environmental Control Systems" (19). The article was written in anticipation
of using the model, or a similar model, for developing an optimal inspection 
policy involving one or more of satellite, airplane, or surface inspections. 
The paper effectively illustrates the dependency of the optimal strategy on 
inspection costs, misclassification probabilities, and violation frequencies. 
This model was used as a base for the more elaborate model given in Volume I. 


The unabridged article is given in this appendix. 





A MODEL OF ENVIRONMENT CONTROL SYSTEMS
Environmental Impact Study 

1. An Environmental Problem

Environmental quality is today a major public policy issue which
subsumes a complex of technical, economic, political, social, institu- 
tional and legal considerations. Broadly speaking, the goal of an en- 
vironmental policy is the maintenance of the natural environment in a 
status which combines aesthetic values consistent with economic pro- 
ductivity in man’s exploitation of his physical surroundings. But even 
if the goal were generally acceptable, the interpretation would vary widely 
between and among the numerous public and private interest groups in 
this country. In order to implement whatever interpretation was agreed
upon, two activities must take place: environmental modeling and en- 
vironmental control. That is, it is necessary to understand first the 
physical processes which determine the state of the environment, and 
secondly, the properties of alternative environmental control mechanisms. 

This paper considers an environmental problem of national 
interest. Several bills have been introduced in Congress to regulate 
or abolish strip mining. The Hays bill (Rep. Wayne Hays, D., Ohio),
for example, would prohibit most such mining on slopes exceeding 20°.
This bill was passed by the House in 1972 and was reintroduced in
1973. The Hechler bill (Rep. Ken Hechler, D., W. Va.) would prohibit
all surface mining within 6 to 18 months of passage. One state
which would be greatly affected by any of these proposed laws is the 
Commonwealth of Kentucky.

Coal, particularly bituminous coal, is the major source of
energy consumption in the United States today and is likely to remain
so for many years to come. Kentucky was the largest producer
of coal in the United States in 1972. The Commonwealth ranks
third in recoverable bituminous coal reserves and also has large
reserves of low sulphur coal. It is estimated that perhaps 70% of the
total coal reserves in Eastern Kentucky are of this grade. 2/ Increasing
national concern over atmospheric pollution has led many municipalities
and cities to require low sulphur fuels for power generation.

It is therefore not surprising that mining is a highly important 
industry in Kentucky. This importance is particularly pronounced in 
regions such as the Appalachian area of Eastern Kentucky where coal 
mining ranks fourth in terms of total employment, accounting for 21% 
of the workforce and 27% of total wages. In 4 counties of the region,
coal mining accounted for over 50% of the work force. Secondary and
indirect employment in other industries, including services, transportation,
trades and, to some extent, government, is highly dependent
upon mining.

The public's attitude toward this industry is mixed because various 
forms of pollution and environmental consequences have attended coal
production: sedimentation, slope failure, chemical pollution, revegetation
difficulties, and aesthetic disturbance being major consequences
of mining. There has been concern over these problems, and vigorous 
protest has been registered by environmental action groups, national 
newspapers, and many government agencies and coal associations. 

Surface mining has been the chief focus of controversy because
of its high visibility. More and more people are questioning
the economic priorities of the past. Surface mines (strip and auger)
accounted for 56% of total Kentucky coal production in 1971, up from
about 39% in 1966. This mirrors a similar but somewhat less dramatic
national trend toward surface mining. Reasons for this trend are not
hard to find. National labor productivity of underground coal mines
in 1967, for example, was only 15.07 tons per man-day relative to 35.17
tons per man-day and 46.48 tons per man-day for strip and auger mines
respectively.

Thus, the promise and problems of surface mining have come
sharply into focus. To strike a balance between economics, energy 
and environment is the central question facing both coal producing states 
and the nation at large. Such a balance will require an appropriate com- 
bination of legislation to control bad practices, of research and develop- 
ment to provide technology and improved operating practices, and of 
"enlightened self-interest" on the part of mine operators. Many of the
surface mines experience a precarious existence which has precluded 
investment in research and development. Longitudinal studies over the 
period 1961-1962 suggest that perhaps 60% of the firms in Eastern 
Kentucky failed to survive this two year interval. Since these companies
often operate on small profit margins, they also are not likely to be 
motivated to employ conservation practices which add to their costs but 
not to the price of coal. It has thus become the role of the state to 
enforce standards of operation upon the companies. 

The Commonwealth of Kentucky has imposed several laws to 
reduce environmental disruption by surface mining. Historically, 
the laws have been enforced by inspectors who periodically visit each
mine. Advances in aerial photography, however, now facilitate
aerial inspection for detection of slides, revegetation failures,
unauthorized mining operations and other prohibited activities. 10/ Satellites,
aircraft, or a combination of both may be utilized in a multi-tier
system. Areas failing inspection are checked by ground inspectors.

The following alternatives are considered in this paper: 

1. Ground inspection

2. Satellite and ground inspection 

3. Aircraft and ground inspection 

4. Satellite, aircraft, and ground inspection 

The objective of the work is to provide a framework for finding the most 
cost-effective inspection system and associated parameters. 

2. Model Development 

Let us assume that in an area to be inspected there are \(N\)
sites at which coal is being surface mined. Prohibited activities are
occurring at \(N_1\) of these sites, while at the others, the prescribed
regulations are being met. The exact number and locations are, of
course, unknown to the state authorities. The Commonwealth is 
responsible for insuring that proper mining practices are maintained 
and, therefore, needs to know the lowest cost way of performing the 
investigating activity. 

It is assumed for illustrative purposes that if a man inspects a
site, he will always make a correct determination of whether or not illegal
activities are occurring. The cost of manned inspection, however, is high.
If inspection via remote sensing is used, either by spacecraft or aircraft,
the possibility of misclassification arises. In this case there exist four
possible outcomes:

(1) a site which is being mined by abusive practices is so
identified (a "bad" site is correctly identified);

(2) a site which is being mined according to proper standards
is so identified (a "good" site is correctly identified);

(3) a "bad" site is classified as "good"; and

(4) a "good" site is classified as "bad".

The last two possibilities are known as the beta (\(\beta\)) and alpha (\(\alpha\)) errors
respectively and are inherent in any decision where there is less than
complete information.

Figure 1 depicts the structures of inspection systems having
a one-tier and a two-tier structure. The present ground inspection
system has a one-tier structure. In this case all sites, whether problem
areas or not, are surveyed by inspectors and, consequently, are all
correctly classified. Both the spacecraft/ground and aircraft/ground
inspection systems have a two-tier structure. In these cases, however,
there is a probability \(\beta\) that a decision rule depending on an aerial
inspection will judge a problem area as a no-problem area, and a
probability \(\alpha\) that the rule will judge a no-problem area as a
problem area. When we need to refer to quantities such as the
misclassification probabilities in relation to either the satellite system or the
aircraft system, we will subscript the quantities with an "s" or "a",
respectively. Thus, for example, \(\alpha_s\) will be the \(\alpha\) error associated
with the satellite inspection.

NOTE: Nodes represented by triangles indicate that no further
inspection is conducted. Nodes represented by squares
indicate manual inspection is conducted. The expected
number of inspections in each category is enclosed in
each box.

Figure 1: Structures for Model Development,
The One and Two-Tier Inspection System

We will assume that a man is sent to check whenever the aerial
inspection judges an area to be a problem area. A consequence of the
alpha error is that unnecessary manned inspection will occur \((N - N_1)\alpha\)
times. The impact of the beta error is that \(N_1\beta\) problem areas will
go undetected. Both kinds of site misclassifications introduce associated
cost penalties, the total magnitude of which is controllable through the
alpha and beta risks of the remote sensing systems. The actual values
of the error probabilities depend upon the technical characteristics of
the remote sensing system and, consequently, the technology that is
available. In the limit we might theoretically design and implement a
remote sensing system that, like the ground inspection system, is
subject to no errors and which would eliminate the \(\alpha\) and \(\beta\) risks. Of
course, the decision to implement such a system would depend on the
costs, both non-recurring and recurring, that would have to be paid for
such a system.
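
The bookkeeping for a two-tier system can be sketched in a few lines of Python. This is only an illustration of the counts described above, not code from the paper; the function name `two_tier_counts` and the sample inputs (1,000 sites, 50 problem sites, 10% alpha and beta risks) are assumptions chosen for the example.

```python
def two_tier_counts(n_sites, n_problem, alpha, beta):
    """Expected site counts when remote sensing screens sites before manual checks."""
    good = n_sites - n_problem
    false_alarms = good * alpha          # good sites flagged as problem areas
    missed = n_problem * beta            # problem sites passed as good
    detected = n_problem * (1 - beta)    # problem sites correctly flagged
    return {
        "false_alarms": false_alarms,
        "missed": missed,
        # Every flagged site, rightly or wrongly, receives a manual inspection.
        "manual_inspections": false_alarms + detected,
    }


# Hypothetical inputs, for illustration only.
counts = two_tier_counts(n_sites=1000, n_problem=50, alpha=0.10, beta=0.10)
print(counts)
```

With these inputs roughly 95 of the 140 manual inspections are unnecessary, while about 5 problem sites go undetected, which is the tradeoff the cost model has to price.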

The structure of a three-tier inspection system is shown in
Figure 2. The satellite/aircraft/ground inspection system has this
structure. A decision rule provides that aircraft will be called in only
after the spacecraft has classified an area as a problem area. The
expected number of problem areas judged as no-problem areas will be

\[ N_1\left[\beta_s + (1-\beta_s)\,\beta_a\right] \qquad (1) \]

and

\[ (N - N_1)\,\alpha_s\,\alpha_a \qquad (2) \]

no-problem areas will be misclassified as problem areas. Analogous
to the two-tier systems, the attending misclassification penalty costs
are unnecessary aircraft and manned inspection costs and the cost of
undetected problem areas.

NOTE: Nodes represented by triangles indicate that no further
inspection is conducted. Nodes represented by squares
indicate manual inspection is conducted. The expected
number of inspections in each category is enclosed in
each box.

Figure 2: Structure of a Three-Tier Inspection System
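
The same bookkeeping extends to the three-tier decision rule (aircraft only after a satellite flag, men only after an aircraft flag). The sketch below is illustrative rather than taken from the paper: it assumes the satellite and aircraft classification errors are independent, and the function name and sample error rates are hypothetical.

```python
def three_tier_counts(n_sites, n_problem, alpha_s, beta_s, alpha_a, beta_a):
    """Expected counts for satellite -> aircraft -> ground inspection.

    Assumes satellite and aircraft errors are independent (an assumption
    made for this sketch).
    """
    good = n_sites - n_problem
    flagged_good = good * alpha_s             # good sites passed on to aircraft
    flagged_bad = n_problem * (1 - beta_s)    # problem sites passed on to aircraft
    return {
        "aircraft_inspections": flagged_good + flagged_bad,
        # Good sites flagged by both remote tiers: unnecessary manual checks.
        "unnecessary_manual": flagged_good * alpha_a,
        "manual_inspections": flagged_good * alpha_a + flagged_bad * (1 - beta_a),
        # A problem site is missed by the satellite, or caught by it and then
        # missed by the aircraft.
        "missed": n_problem * (beta_s + (1 - beta_s) * beta_a),
    }


# Hypothetical inputs, for illustration only.
three = three_tier_counts(1000, 50, alpha_s=0.10, beta_s=0.10,
                          alpha_a=0.05, beta_a=0.05)
print(three)
```

Adding a tier cuts the unnecessary manual inspections sharply but raises the number of missed problem sites, since a site must now survive two imperfect screens to reach an inspector.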

From the decision models shown in Figures 1 and 2, the cost
functions presented in Figure 3 may be derived. It was assumed that
satellite inspection represents a fixed cost if used; the incremental
costs are regarded as zero. The aircraft inspection costs, as shown
in Figure 3, may be derived directly from the structures in Figures 1
and 2 and, as shown, depend on the decision model (i.e., whether or
not aircraft inspection follows a determination by satellite inspection
that there is a problem area). The number of "tiers," or combinations
of inspection schemes, is provided for in the cost model by the binary
variables \(X_s\) and \(X_a\), their values depending on whether or not spacecraft
and aircraft systems are being used, respectively. If spacecraft
are used, for example, then \(X_s\) would be one. If spacecraft are not
used, then \(X_s\) would be zero.

The third cost factor, "false negatives," indicates the social
cost of a beta-type error. It includes the social and economic cost of
nondetection and, by implication, the non-correction of a problem area.
Some of the cost of nondetection results from the probability of physical
damage, the value of which can be estimated. Other costs, however,
are for non-market goods and activities. The values of these goods are
difficult to determine and could theoretically range from zero to infinity
depending upon the imputation of the social costs incurred due to
misclassification.

Cost Factor

1. Satellite Inspection

\[ C_s X_s \]

2. Aircraft Inspection

\[ C_a\left\{ X_a(1-X_s)\,N + X_a X_s\left[(N-N_1)\alpha_s + N_1(1-\beta_s)\right] \right\} \]

3. False Negatives

\[ C_p N_1\left\{ X_s(1-X_a)\,\beta_s + X_a(1-X_s)\,\beta_a + X_s X_a\left[\beta_s + (1-\beta_s)\beta_a\right] \right\} \]

4. Manual Inspection

\[ C_m\left\{ (1-X_s)(1-X_a)\,N + X_s(1-X_a)\left[(N-N_1)\alpha_s + N_1(1-\beta_s)\right] + X_a(1-X_s)\left[(N-N_1)\alpha_a + N_1(1-\beta_a)\right] + X_s X_a\left[(N-N_1)\alpha_s\alpha_a + N_1(1-\beta_s)(1-\beta_a)\right] \right\} \]

WHERE

\(C_m\) = cost/site inspected manually.

\(C_s\) = cost of satellite inspection.

\(C_a\) = cost/site inspected by aircraft.

\(C_p\) = cost/problem area not detected.

\(\alpha\) = probability "good" area is misclassified as problem area.

\(\beta\) = probability problem area is misclassified as good.

\(X_s, X_a\) = integer variables to denote whether
satellite or aircraft inspection is used.


Figure 3: Composite Cost Function for Inspection Policies
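
A short Python sketch may help in reading the composite cost function. It is not the authors' program: it assumes the expected counts implied by the decision structures of Figures 1 and 2 with independent satellite and aircraft errors, and the function and argument names are hypothetical. The cost factors are those listed with Figure 4; the 50-problem-site case with 10% satellite and 5% aircraft error rates is an assumed example, chosen because it reproduces the $50,000, $17,200 and $19,175 figures quoted later in the text for run 5.

```python
def inspection_cost(n, n1, c_m, c_s, c_a, c_p,
                    alpha_s, beta_s, alpha_a, beta_a, x_s, x_a):
    """Total expected cost of one policy; x_s and x_a switch the tiers on (1) or off (0)."""
    good = n - n1
    # Sites passed on by each screening tier (false alarms plus true detections).
    flagged_by_sat = good * alpha_s + n1 * (1 - beta_s)
    flagged_by_air = good * alpha_a + n1 * (1 - beta_a)
    flagged_by_both = good * alpha_s * alpha_a + n1 * (1 - beta_s) * (1 - beta_a)

    satellite = c_s * x_s
    aircraft = c_a * (x_a * (1 - x_s) * n + x_a * x_s * flagged_by_sat)
    false_negatives = c_p * n1 * (
        x_s * (1 - x_a) * beta_s
        + x_a * (1 - x_s) * beta_a
        + x_s * x_a * (beta_s + (1 - beta_s) * beta_a)
    )
    manual = c_m * (
        (1 - x_s) * (1 - x_a) * n
        + x_s * (1 - x_a) * flagged_by_sat
        + x_a * (1 - x_s) * flagged_by_air
        + x_s * x_a * flagged_by_both
    )
    return satellite + aircraft + false_negatives + manual


# Assumed example inputs (error rates are illustrative).
params = dict(n=1000, n1=50, c_m=50, c_s=200, c_a=15, c_p=2000,
              alpha_s=0.10, beta_s=0.10, alpha_a=0.05, beta_a=0.05)
policies = {"P1": (0, 0), "P2": (1, 0), "P3": (0, 1), "P4": (1, 1)}  # (x_s, x_a)
policy_costs = {name: inspection_cost(x_s=xs, x_a=xa, **params)
                for name, (xs, xa) in policies.items()}
for name, cost in policy_costs.items():
    print(name, round(cost))
```

Writing the four policies as settings of the two switches keeps a single cost expression for all cases, which is the design choice the binary variables in Figure 3 express.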




In Figure 4, several sets of assumed values are given to the
parameters discussed so far, and the alternative inspection policies
are compared depending upon the values of the parameters. Policy 1
(\(P_1\)) assumes a man-only investigation and, therefore, with an assumed
price of $50 per site, and a thousand sites, there is an invariant cost
of $50,000 to investigate all of the locations. Policy 2 (\(P_2\)) assumes
that ground investigation occurs only after it is determined by satellite
that a site is a problem area. Policy 3 (\(P_3\)) assumes that ground
inspection occurs only after it is determined by aircraft that a site is a
problem area. Policy 4 (\(P_4\)) assumes that men are called in to investigate
only after it is determined both by satellite and aircraft that a site
is a problem area.

In Figure 4, the costs of implementing the four inspection plans
are given under conditions of relatively high and low alpha and beta
risks for aircraft and spacecraft. Holding all other parameters constant,
it is seen that the costs, and consequently the choices, of the
alternative inspection policies are very sensitive to the alpha and beta
risks associated with aircraft and spacecraft. When the alpha risk is
relatively high (20% as compared with 10%), then an increased cost
would be incurred for re-inspecting sites which are, in fact, not problem
areas. Also, there is a high likelihood of incurring the social cost of
not detecting problem sites when the beta risk is relatively high. The
asterisks in Figure 4 identify the optimal policies in each case. It is
seen that even if the alpha and beta risks are relatively high, the three-tier
and two-tier inspection systems are economically preferred over
manual inspection only. The model demonstrates that remote sensing
systems can be useful even though they may be inaccurate. Whether
this is true, in fact, will depend upon the particular application of the
model and the inputs appropriate to the application. In general, if the
number of problem areas, the social cost of misclassification, or the
\(\alpha\) and \(\beta\) errors are high, the optimal policy is ground inspection only.
This results from the expectation of incurring substantial social costs
for undetected problem sites. When the alpha and beta risks are
relatively low and equal for aircraft and spacecraft systems, the policy \(P_2\),
a spacecraft/ground system, is preferred. This results from the
fact that the spacecraft system costs are less than the aircraft system
costs.

DIRECT COMMON COST FACTORS PER SITE

Ground (men)    Satellite    Aircraft    Cost of Misclassification
                                         by beta-type error
C_m = 50        C_s = 200    C_a = 15    C_p = 2000

Run    Input Factors (error rates in %)              Cost of Survey ($1,000)
No.    N      N1     alpha_s  beta_s  alpha_a  beta_a    P1     P2      P3      P4
1      1000   10     20.      25.     5.       15.       50     15.5    --      11.3*
2      1000   50     20.      25.     5.       15.       50     --      34.5*   41.9
3      1000   100    20.      25.     5.       15.       50     --      --      --
4      1000   10     10.      10.     5.       5.        50     7.6     --      --
5      1000   50     10.      10.     5.       5.        50     17.2*   24.7    19.1
6      1000   100    10.      10.     5.       5.        50     29.2*   31.9    36.4

Costs of Optimal Choices are Denoted with an Asterisk (*)
-- = entry not legible in the source copy

N = Total Number of Sites
N1 = Number of Defective Sites
alpha = Rate of Occurrence of alpha-Type Errors
beta = Rate of Occurrence of beta-Type Errors
( )_a = ( ) for aircraft
( )_s = ( ) for satellite

P1 = Ground (men) only
P2 = Satellite + Ground
P3 = Aircraft + Ground
P4 = Satellite + Aircraft + Ground

Figure 4: Illustrative Results With Simple Survey Model

Figure 5 maps other information about the systems onto a graph
in which the horizontal axis represents the parameter \(N_1\), the number
of defective areas, and the vertical axis represents the total cost of the
alternative inspection programs. The values of the parameters other
than \(N_1\) are given in the top half of Figure 4 in runs 1 through 3. The
efficiency frontier that has been drawn indicates the lowest cost strategy
as a function of the number of defective areas in the actual population.
Any policy other than the one indicated for a given value of \(N_1\) is
inefficient from an economic standpoint. At values of \(N_1\) less than
15, the three-tier plan is the most cost-effective approach.
Above that, up to about 39 defective areas, the man/spacecraft approach
is the most cost-effective; from 40 to approximately 95, the aircraft/man
plan is preferred; and above 95, a man-only plan is the cost-effective
approach. The shape of the efficiency frontier depends upon the value
of the parameters. At the limiting case of \(C_p\) equal to infinity, where no
beta risks are tolerated, either a man-only system or an enhanced
remote sensing system will be chosen, assuming that the technology
is available to reduce \(\beta_s\) or \(\beta_a\) to zero. The choice would depend
upon the relative costs of these systems.

[Curves plotted against \(N_1\) (Number of Defective Areas): Three-Tier Approach; Spacecraft/Man Approach; Aircraft/Man Approach; Man-Only Approach; Efficiency Frontier (Locus of Optimal Solutions).]

Figure 5: Efficiency Frontier for Environmental Inspection Policies

We have seen that a simple model can be used to assess the
economic impact of an important technical characteristic of remote
sensing systems, the system accuracy, on the selection of a cost-effective
system. Another technical aspect of the remote sensing
system which influences the choice of the most cost-effective inspection
mode is that of system availability. This system characteristic is influenced
by many factors, some of which are related to the system
design and some of which are exogenous to the system, such as weather
conditions. The potential impact of system availability on the choice
of the economically optimum inspection mode can be determined by our
model, as is illustrated in Figure 6, a sample computer output. These
results are based on the parameters used in run 5, shown in Figure 4.

Figure 6: Solutions for Partial Availabilities of Aircraft and Satellite Systems

The corner points of the cost grid map represent the four basic
inspection alternatives under the assumption that the remote sensing
systems are either never used or always used. For example, the man-only
inspection system, having a cost of $50,000, is represented by the
grid point (aircraft, satellite) = (0, 0). In contrast, the two-tier inspection
system, which calls for manual inspection of only those sites
that have been classified as bad by a satellite, has a cost of $17,200 and
is represented by the grid point (aircraft, satellite) = (0, 1). By inspection
of the corner points, one can readily verify that the two-tier
satellite/man policy is the cheapest strategy of the four basic alternatives.
Suppose, however, that we now consider the question of
satellite availability. If, for some reason, the satellite is available
for site inspection only 80% of the time, then the actual cost of the
satellite/man policy is not $17,200 but rather $23,760, corresponding
to the grid point (aircraft, satellite) = (0, 0.80). The fact that the
satellite is unavailable for some fraction of the site inspections markedly
changes the cost of this policy and may render it a cost-ineffective
choice of inspection mode. For the data given in Figure 6, for example,
the satellite/man inspection policy is cost effective only if satellite
availability exceeds 90%. If the availability of the satellite is below
this level, then the optimum inspection policy is the three-tier policy,
and this remains true regardless of the availability of the aircraft remote
sensing system. However, as is evident from the cost grid, the cost
of implementing the three-tier system will not be $19,175 (as indicated
by the grid point (aircraft, satellite) = (1, 1)) but instead will depend
upon the availability actually achieved by the satellite and aircraft sensing
systems. The cost model presented in Figure 4 allows for explicit
consideration and evaluation of this primary technical system characteristic.
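
One way to read the cost grid, sketched below in Python, is to treat availability as the fraction of site inspections for which a system can be used, so that the expected cost is a weighted mix of the four corner policies. That interpretation is an assumption of this sketch, adopted because it reproduces the $23,760 figure quoted above; the function name is hypothetical, and the aircraft/man corner cost of $24,700 is taken from the run 5 entry of Figure 4.

```python
# Corner costs in dollars, keyed by (aircraft, satellite) usage.
CORNER_COSTS = {
    (0, 0): 50_000,   # man only
    (0, 1): 17_200,   # satellite/man
    (1, 0): 24_700,   # aircraft/man (Figure 4, run 5, rounded to $100)
    (1, 1): 19_175,   # three-tier
}


def expected_cost(aircraft_avail, satellite_avail, corners=CORNER_COSTS):
    """Bilinear mix of the corner costs, weighted by system availability."""
    total = 0.0
    for (use_air, use_sat), cost in corners.items():
        w_air = aircraft_avail if use_air else 1 - aircraft_avail
        w_sat = satellite_avail if use_sat else 1 - satellite_avail
        total += w_air * w_sat * cost
    return total


sat_man_80 = expected_cost(aircraft_avail=0.0, satellite_avail=0.80)
three_tier_80 = expected_cost(aircraft_avail=1.0, satellite_avail=0.80)
print(round(sat_man_80), round(three_tier_80))
```

Under this reading, at 80% satellite availability the three-tier policy with a fully available aircraft costs less than the satellite/man policy, in line with the ordering described in the text.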

Figure 7 contains the result of a sensitivity analysis for run 2
of Figure 4 to explore the parameter ranges over which policy \(P_3\)
is optimal. For each parameter it shows the lower and upper limits
and the policies which become optimal beyond the intervals. For
example, the ground inspection cost can vary over a wide range from
$35 to $203. Policy \(P_2\) requires more ground inspections and, consequently,
benefits more from a lower inspection cost. Conversely, policy
\(P_4\) requires fewer men and suffers less from an increased cost. The
satellite cost presents a different situation. Reducing the cost helps
\(P_2\) and \(P_4\), but since at most only $200 can be saved, it is not
sufficient to make either of these policies optimal. \(P_1\) and \(P_3\) are
not dependent on the satellite cost, so there is no change in their
relative status and we see that \(P_3\) is optimal over the full range of
\(C_s\). A similar review can be made for each of the other parameters,
showing when and why each range limit and policy shift occurs.

Figure 7: Illustrative Sensitivity Analysis


3. Variations in Errors

Most systems can be altered so as to increase \(\alpha\)-type errors
while decreasing \(\beta\)-type errors or vice versa. In this system, the
errors arise from misclassification. Changing the acceptance standards
corresponds to changing the \(\alpha\) and \(\beta\) errors. Hypothetical tradeoff
curves for \(\alpha_s\), \(\beta_s\) and \(\alpha_a\), \(\beta_a\) are shown in Figure 8.

Figure 8: Hypothetical Tradeoff Curves for Satellite and Aircraft
\(\alpha\)-type Errors Versus \(\beta\)-type Errors

They are hyperbolic curves with the property that reducing one probability
by a factor of 50% results in doubling the other probability. The particular
choice of this function is for illustrative purposes only; empirical tradeoff
curves must be determined for each application.

This information combined with the earlier derived cost equations
allows us to determine the optimal values of \(\alpha\) and \(\beta\) to be used and,
consequently, how to establish optimal acceptance criteria for the aircraft
and satellite inspections. As an example, consider the cost expression
for the satellite and man inspection system:

\[ C_s + C_p N_1 \beta_s + C_m\left[(N-N_1)\alpha_s + N_1(1-\beta_s)\right] \qquad (3) \]

Using \(\alpha_s\beta_s = (10\%)^2\) and rearranging terms, this expression becomes:

\[ (C_s + C_m N_1) + (C_p - C_m) N_1 \beta_s + (10\%)^2\, C_m (N-N_1)/\beta_s \qquad (4) \]

The optimal value of \(\beta_s\) is obtained by setting to zero the derivative of
this expression with respect to \(\beta_s\):

\[ (C_p - C_m) N_1 - (10\%)^2\, C_m (N-N_1)/\beta_s^2 = 0 \qquad (5) \]

Solving for \(\beta_s\) yields the optimal value, denoted \(\beta_s^*\):

\[ \beta_s^* = 10\%\,\left(C_m(N-N_1)\right)^{1/2}\left((C_p - C_m)N_1\right)^{-1/2} \qquad (6) \]

The corresponding value of \(\alpha_s\) is:

\[ \alpha_s^* = 10\%\,\left((C_p - C_m)N_1\right)^{1/2}\left(C_m(N-N_1)\right)^{-1/2} \qquad (7) \]

If these expressions result in either \(\alpha_s^*\) or \(\beta_s^*\) being greater than one,
then the correct solution is obtained by setting that probability to 100%
and the other to 1%.

If \(C_p\) is less than \(C_m\), the expressions for \(\alpha_s^*\) and \(\beta_s^*\) become
imaginary. This occurs if the penalty cost is less than the cost of manual
inspection. In this case, \(\beta_s^* = 100\%\), \(\alpha_s^* = 1\%\) is the optimal solution.

A corresponding result for the aircraft and ground system can be
obtained. In this case the expression for the cost is

\[ (C_a N + C_m N_1) + (C_p - C_m) N_1 \beta_a + (5\%)^2\, C_m (N-N_1)/\beta_a \qquad (8) \]

The optimal values are:

\[ \beta_a^* = 5\%\,\left(C_m(N-N_1)\right)^{1/2}\left((C_p - C_m)N_1\right)^{-1/2} \qquad (9) \]

\[ \alpha_a^* = 5\%\,\left((C_p - C_m)N_1\right)^{1/2}\left(C_m(N-N_1)\right)^{-1/2} \qquad (10) \]

It will be noted that the expressions are the same except for the leading
coefficients, which are the square root of the constant term in the tradeoff
curve.


This observation is a special case of the general conclusion that
for any two tier system, if the \(\alpha\) and \(\beta\) type errors are related by the
tradeoff curve \(\alpha\beta = T^2\), then the optimal values are given by:

\[ \beta^* = T\left(C_m(N-N_1)\right)^{1/2}\left((C_p - C_m)N_1\right)^{-1/2} \qquad (11) \]

\[ \alpha^* = T\left((C_p - C_m)N_1\right)^{1/2}\left(C_m(N-N_1)\right)^{-1/2} \qquad (12) \]

If these expressions result in either \(\alpha^*\) or \(\beta^*\) being greater than one,
then the correct optimal solution is obtained by setting that probability
to 100% and the other to \(T^2\). If \(C_p\) is less than \(C_m\), the optimal
solution is obtained by setting \(\beta^* = 100\%\) and \(\alpha^* = T^2\).
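
The two-tier rule, including its boundary cases, can be sketched in Python as follows. This is an illustrative implementation of the formulas for the tradeoff curve alpha * beta = T**2, not the authors' program; the function name is hypothetical, and the treatment of the case where the penalty cost exactly equals the manual cost is an assumption of the sketch (the text covers only the strictly smaller case).

```python
import math


def optimal_errors(T, n_sites, n_problem, c_manual, c_penalty):
    """Return (alpha*, beta*) on the tradeoff curve alpha * beta = T**2."""
    if not 0 < n_problem < n_sites:
        raise ValueError("need 0 < n_problem < n_sites")
    if c_penalty <= c_manual:
        # Missing a problem site costs no more than inspecting it:
        # accept every beta error (equality handled the same way by assumption).
        return T ** 2, 1.0
    ratio = math.sqrt((c_penalty - c_manual) * n_problem
                      / (c_manual * (n_sites - n_problem)))
    alpha, beta = T * ratio, T / ratio
    # A probability cannot exceed one; pin it there and stay on the curve.
    if alpha > 1.0:
        return 1.0, T ** 2
    if beta > 1.0:
        return T ** 2, 1.0
    return alpha, beta


# Illustrative inputs: T = 10%, 1,000 sites, 10 problem sites,
# $50 manual inspection, $2,000 penalty per undetected problem site.
alpha_opt, beta_opt = optimal_errors(T=0.10, n_sites=1000, n_problem=10,
                                     c_manual=50, c_penalty=2000)
print(f"alpha* = {alpha_opt:.5f}, beta* = {beta_opt:.5f}")
```

When problem sites are rare, as in this example, the rule tolerates a larger beta risk in exchange for fewer false alarms, because most manual inspections would otherwise be spent on good sites.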

Using the values of the parameters given in Figure 4, the
optimal values for \(\alpha^*\) and \(\beta^*\) for both satellite/ground and aircraft/ground
systems are shown in Figure 9.

A similar analysis can be conducted for the three tier system.
In this case a pair of simultaneous nonlinear equations is obtained
which can be reduced to a single fourth order equation. The various
cases resulting from the several roots of the equation and the interactions
with the boundary conditions are too complex for presentation
here but are obtained in a straightforward manner.

Generally, the value of T can be decreased by the expenditure
of more money. Increasing the time per aircraft inspection, for example,
might produce such an improvement. Note that for the two tier system,
the change in \(\alpha^*\) and \(\beta^*\) is proportional to T. This is shown in Figure 10.

Direct common cost factors per site:

Ground (men)    Satellite    Aircraft    Cost of Misclassification
                                         by beta-type Errors
C_m = 50        C_s = 200    C_a = 15    C_p = 2000

                     Optimal Parameters (%)
Run   Input Factors  Satellite/Grnd.           Aircrft/Grnd.             Cost of Survey ($1,000)   Reduction in
No.   N      N1      T_s    alpha_s  beta_s    T_a    alpha_a  beta_a    P1     P2      P3          Cost (%), P2
1     1000   10      22.4   --       --        8.7    5.436    13.798    50     14.6    20.9        6.8
2     1000   50      --     --       --        --     12.408   6.045     50     33.1    29.3        9.5
3     1000   100     --     46.547   --        --     18.028   4.160     50     47.1    36.2        25.1
4     1000   10      --     6.276    --        --     3.138    7.962     50     6.9     18.6        9.0
5     1000   50      --     --       --        --     7.164    3.490     50     16.3    24.3        5.2
6     1000   100     --     20.817   --        --     10.408   2.402     50     23.9    29.4        18.0

-- = entry not legible in the source copy

Figure 9. Optimal \(\alpha\) and \(\beta\) Type Errors

These values of \(T_s\) and \(T_a\) correspond to those implicit in runs 1 through 6
of Figure 4.

[Tradeoff curves \(\alpha\beta = T^2\) and \(\alpha\beta = T'^2\).]

Figure 10. Change in optimal error terms
for change in technological
capability.

The cost of decreasing T generally rises nonlinearly as T approaches
0. Hypothetical cost curves for T are shown in Figure 11. Because
changes in T often result from improvements in the technology used,
these are known as technological cost curves.

[Two panels, Satellite/Ground System and Aircraft/Ground System, each plotting cost per inspection against T.]

Figure 11. Hypothetical Technological Cost Curves

The cost of the satellite inspection system was given above as:

\[ (C_s + C_m N_1) + (C_p - C_m) N_1 \beta_s + C_m (N-N_1)\,\alpha_s \qquad (13) \]

where the optimal values of \(\beta_s\) and \(\alpha_s\) are given by:

\[ \beta_s = T_s\left(C_m(N-N_1)\right)^{1/2}\left((C_p - C_m)N_1\right)^{-1/2} \qquad (14) \]

\[ \alpha_s = T_s\left((C_p - C_m)N_1\right)^{1/2}\left(C_m(N-N_1)\right)^{-1/2} \qquad (15) \]

Substituting and combining like terms yields the expression:

\[ (C_s + C_m N_1) + 2T_s\left(C_m(C_p - C_m)(N-N_1)N_1\right)^{1/2} \qquad (16) \]

In order to find the optimal value of T, we add the cost of technological
improvement from Figure 11:

\[ C_u N\left(T_s^{-1} - 1\right) \qquad (17) \]

The sum of these two terms is then differentiated with respect to
\(T_s\) and the derivative set to zero, yielding:

\[ 2\left(C_m(C_p - C_m)(N-N_1)N_1\right)^{1/2} = C_u N\, T_s^{-2} \qquad (18) \]

The optimal value of \(T_s\), denoted \(T_s^*\), is thus:

\[ T_s^* = (C_u N)^{1/2}\left[2\left(C_m(C_p - C_m)(N-N_1)N_1\right)^{1/2}\right]^{-1/2} \qquad (19) \]

A similar analysis for the aircraft/ground system yields:

\[ T_a^* = (C_v N)^{1/2}\left[2\left(C_m(C_p - C_m)(N-N_1)N_1\right)^{1/2}\right]^{-1/2} \qquad (20) \]

The same approach may be used for the three tier system but is too
complex for presentation here.

Using the data presented in Figure 4, the following selection of
optimal T, \(\alpha\), and \(\beta\) values can be derived as shown in Figure 12 for
the two two-tier systems.
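
The optimization over T can be checked numerically with a short Python sketch. It is illustrative only: it assumes a technological cost of the form C N (1/T - 1), as read from equation (17), uses the technological cost coefficients 2 (satellite) and .50 (aircraft) listed with Figure 12, and the function names are hypothetical.

```python
import math


def optimal_T(c_tech, n_sites, n_problem, c_manual, c_penalty):
    """Closed-form T* from setting the derivative of the total cost to zero."""
    k = c_manual * (c_penalty - c_manual) * (n_sites - n_problem) * n_problem
    return math.sqrt(c_tech * n_sites / (2.0 * math.sqrt(k)))


def total_cost(T, fixed, c_tech, n_sites, n_problem, c_manual, c_penalty):
    """Inspection cost at the optimal alpha, beta plus the technology cost."""
    k = c_manual * (c_penalty - c_manual) * (n_sites - n_problem) * n_problem
    inspection = fixed + c_manual * n_problem + 2.0 * T * math.sqrt(k)
    technology = c_tech * n_sites * (1.0 / T - 1.0)   # assumed cost curve
    return inspection + technology


# 1,000 sites, 10 problem sites, $50 manual cost, $2,000 penalty.
site = dict(n_sites=1000, n_problem=10, c_manual=50, c_penalty=2000)
T_s = optimal_T(c_tech=2.0, **site)    # satellite/ground
T_a = optimal_T(c_tech=0.5, **site)    # aircraft/ground
print(f"T_s* = {T_s:.6f}, T_a* = {T_a:.6f}")

# Sanity check: the closed form should beat nearby values of T.
best = total_cost(T_s, fixed=200, c_tech=2.0, **site)
nearby = [total_cost(f * T_s, fixed=200, c_tech=2.0, **site) for f in (0.9, 1.1)]
print(best < min(nearby))
```

For this case the sketch gives T values close to the 17.9407% and 08.9704% entries in the first row of Figure 12.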

4. Conclusions 

It is anticipated that a model such as we have described can be
very useful in determining the optimal strategy for alternative remote
sensing systems, since it incorporates cost, technology characteristics,
econometric estimation, and public policy. The description given is for
a general model; individual specifications, of course, must be tailored
to the application or case study to be investigated. As seen, the model
is simple and yet elegant and powerful. The alpha and beta risks are
technical questions and, therefore, allow us to parameterize the quality
or accuracy of alternative remote sensing systems. In addition, the
model allows us to parameterize the operational availability achieved
by the remote sensing systems and examine the cost impact of this

Direct common cost factors per site:

    Ground (men)   Satellite    Aircraft    Cost of misclassification
                                            by beta-type error
    C_m = 50       C_s = 200    C_a = 15    C_p = 2000

Technological cost:

    Satellite   C_u = 2
    Aircraft    C_v = .50

Optimal parameters (%):

         Input factors   Satellite/Ground               Aircraft/Ground
    Run  N      N_1      T_s      alpha_s  beta_s       T_a      alpha_a  beta_a
    A    1000   10       17.9407  11.2604  28.5841      08.9704  05.6302  14.2921
    B    1000   50       12.1220  17.3672  08.4609      06.0610  08.6836  04.2305
    C    1000   100      10.3321  21.5080  04.9634      05.1661  10.7541  02.4817

Cost of survey ($1,000) and reductions in cost relative to Table 4*:

         Cost of survey ($1,000)            Reductions in cost / Table 4*
         Ground   Satellite/  Aircraft/     Runs 1,2,3             Runs 4,5,6
    Run  only     Ground      Ground        Sat./Gr.   Air./Gr.    Sat./Gr.   Air./Gr.
    A    50       11.8        21.1          44.9%      19.4%       52.4%      25.0%
    B    50       19.2        25.7          52.1%      25.5%       44.0%      23.8%
    C    50       24.6        29.7          54.6%      28.5%       41.4%      23.6%

* After adjustment for technological development cost not included there.


Figure 12. Optimal Technological and Error Parameters
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important system characteristic. There are several relevant directions 
for further model development that are readily apparent:

o introduction of a larger set of classification outcomes
(i.e., "fuzzy" results)

o multiple inspection objectives

o realistic cost functions for inspection techniques
(e.g., fixed cost aspects)

o dependence of alpha and beta errors upon the magnitude of
a problem area

o more realistic tradeoffs between $\alpha$ and $\beta$ errors

o budget constraints on inspection policies

o more complex inspection policies (e.g., using random
inspection of sites classified as no problem).

The potential of each of these factors to sharpen the analysis of,
and thereby enhance, the study results may be determined by extending
this model. As an illustration, a more complex ground inspection cost
function is modelled in the Appendix (A').

We wish to emphasize the important lessons that can be gleaned
from this illustrative model:

1. simple models lend insight to the investigative process.

2. as our model has demonstrated, a satellite can be a cost-
effective component of an information retrieval system
even though it may not be the most accurate and/or
reliable component of the system.

3. model results can lead to profound changes in current
ecosystems information retrieval and control practices.

APPENDIX A'

An Alternative Cost Model 

The fundamental model can be extended in several ways to 
improve its accuracy. One such improvement can be made in representing 
the ground inspection costs. The agency responsible for inspection in 
general cannot alter its staff at will. It will in fact hire a number of 
inspectors for this purpose with the consequence that the cost of this 
staff will be fixed. To handle any additional inspections above what the 
staff can normally handle, the inspectors may be asked to work overtime 
and employees in other areas may be utilized under a part-time, temporary 
arrangement. 

The cost relationships of this model can be defined in terms of 
the following parameters: 

$M$ - the number of inspectors hired on a
permanent basis

$\theta$ - the number of inspections that can be
conducted per inspector

$\gamma$ - the cost per inspector incurred in
one period

$\gamma'$ - the cost per inspection for additional
inspections above those that can be
performed by the permanent staff.

If $n$ inspections are required in a period, the cost is either $\gamma M$ or
$\gamma M + \gamma' (n - \theta M)$, depending on whether $n$ is less than or greater than
$\theta M$, respectively. Mathematically this can be expressed as:

$$ \text{personnel cost} = \gamma M + \gamma' \max[0,\; n - \theta M] $$

This is shown in Figure A-1.

[Figure A-1. Cost Model for Manual Inspection (vertical axis: labor cost)]

In a two tier system, either aircraft and ground or satellite
and ground, the number of inspections required is the sum of two
quantities: the number of problem areas which are recognized as
such, denoted $n_1$, and the number of non-problems which are identified
as problem areas, denoted $n_2$. Both $n_1$ and $n_2$ are independent,
binomially distributed random variables. Referring to Figure A-2,
the expected values of $n_1$ and $n_2$ are $N_1 (1-\beta)$ and $(N-N_1)\alpha$ respec-
tively. The respective variances are $N_1 (1-\beta)\beta$ and $(N-N_1)\alpha(1-\alpha)$. The
$\alpha$ and $\beta$ errors are those associated with whichever two-tier system is
under consideration. In practical applications, we may expect that the
number of problem areas is small and that most inspections are con-
ducted for non-problem areas. In this case, $n_1$ can be disregarded.

For a large number of required inspections, the normal distribution
provides a satisfactory approximation to the binomial distribution. Hald's 7/
inequality, $np(1-p) > 9$, provides a definition of the acceptable range for
the approximation. For $n_2$ this is

$$ (N - N_1)\,\alpha (1 - \alpha) > 9. \tag{A1} $$

Since $(1-\alpha)$ may be assumed to be greater than 0.5, the approximation
will be valid for

$$ (N - N_1)\,\alpha > 18. \tag{A2} $$

The expression on the left, of course, is simply the expected number of 
required inspections. The use of the normal approximation permits us to 
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develop an analytic expression for the total cost and to find the size 
of the permanent staff which minimizes this quantity. 

The total cost, including the personnel cost, is:

$$ T = C_s + (N_1 - n_1)\, C_p + \gamma M + \gamma' \max[0,\; n - \theta M] \tag{A3} $$

The expected value of the total cost is:

$$ E[T] = C_s + \beta N_1 C_p + \gamma M + \gamma' \int_{\theta M}^{\infty} (n - \theta M)\, p(n)\, dn \tag{A4} $$


where $p(n)$ represents a normal distribution with the parameters
$\mu = \alpha (N - N_1)$ and $\sigma^2 = \alpha (1-\alpha)(N - N_1)$. The integral in this expression,
known as the "partial expectation," does not have a closed form ex-
pression. It is tabulated in such sources as Brown, 8/ in Table D.6.

For the values of $\theta M$ in the range between the mean and the
mean plus two times the standard deviation, Parker's 9/ service function
approximation may be used. Mathematically this gives:

if

$$ \mu < \theta M \le \mu + 2\sigma $$

then

$$ \int_{\theta M}^{\infty} (n - \theta M)\, p(n)\, dn = 0.45\,\sigma \exp\!\left( -(\theta M - \mu)/(0.60\,\sigma) \right) \tag{A5} $$
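To see how close approximation (A5) is over its stated range, the partial expectation can also be evaluated directly from the normal density and tail probability (the standard identity $\sigma[\phi(z) - z(1 - \Phi(z))]$ with $z = (\theta M - \mu)/\sigma$). The sketch below is illustrative; the function names are hypothetical and the identity is a textbook result rather than something stated in this appendix.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_sf(z):
    """Standard normal upper-tail probability, 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def partial_expectation_exact(k, mu, sigma):
    """E[max(0, n - k)] for n ~ Normal(mu, sigma^2)."""
    z = (k - mu) / sigma
    return sigma * (normal_pdf(z) - z * normal_sf(z))


def partial_expectation_parker(k, mu, sigma):
    """Approximation (A5), intended for mu < k <= mu + 2 sigma."""
    return 0.45 * sigma * math.exp(-(k - mu) / (0.60 * sigma))


if __name__ == "__main__":
    mu, sigma = 95.0, 9.0  # illustrative values only
    for z in (0.0, 0.5, 1.0, 1.5, 2.0):
        k = mu + z * sigma
        print(z, round(partial_expectation_exact(k, mu, sigma), 3),
              round(partial_expectation_parker(k, mu, sigma), 3))
```

The two agree most closely near one standard deviation above the mean and drift apart toward the ends of the range, which is consistent with the range restriction attached to (A5).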
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Solving for the optimal value of M, say M , we find: 

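$$ M^* = \theta^{-1} \left[ \mu - 0.60\,\sigma \ln\!\left( 0.60\,\gamma / (0.45\,\gamma' \theta) \right) \right] \tag{A9} $$

Substituting for $\mu$ and $\sigma$ yields:

$$ M^* = \theta^{-1} \left[ \alpha (N - N_1) - 0.60 \left( \alpha (1-\alpha)(N - N_1) \right)^{1/2} \ln\!\left( 0.60\,\gamma / (0.45\,\gamma' \theta) \right) \right] \tag{A10} $$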

If the number of problem areas identified as such, $n_1$, is not a
negligible quantity, a different approach is required. Let us suppose
that the expected value of $n_1$ also exceeds 18 so that the normal
approximation can be used. Then since $n_1$ and $n_2$ are normally
distributed, so is their sum $n$. The parameters of the three distri-
butions are given in Figure A-2.

                n_1               n_2                    n = n_1 + n_2
    mean        N_1(1-beta)       (N-N_1)alpha           N_1(1-beta) + (N-N_1)alpha
    variance    N_1 beta(1-beta)  (N-N_1)alpha(1-alpha)  N_1 beta(1-beta) + (N-N_1)alpha(1-alpha)

Figure A-2. Parameters of Distributions

The preceding derivation is unchanged except for the substitution
for $\mu$ and $\sigma$ in the expression for $M^*$. The result in this case is

$$ M^* = \theta^{-1} \left[ N_1 (1-\beta) + (N - N_1)\alpha - 0.60 \left( N_1 \beta (1-\beta) + (N - N_1)\alpha(1-\alpha) \right)^{1/2} \ln\!\left( 0.60\,\gamma / (0.45\,\gamma' \theta) \right) \right] \tag{A11} $$

To illustrate the use of this formula, the two runs with $N_1 = 50$
in Table 4 have been recalculated. The expected values of $n_1$ and
$n_2$ in this case are given in Figure A-3. All are sufficiently greater
than 18 so that a normal approximation to both $n_1$ and $n_2$ is
acceptable.

    E[n_1] = N_1(1-beta)      37.5    45.0    42.5    47.5

    E[n_2] = alpha(N-N_1)

Figure A-3. Expected Misclassifications for
Selected Error Levels

Each inspector can conduct 25 inspections in one period ($\theta = 25$).
The cost per inspector per period is the product of the earlier cost per
inspection, $C_m$, and this quantity ($\gamma = 1250$). The incremental cost
per inspection, assuming that these are performed on overtime, may be
taken as 150% of $C_m$ ($\gamma' = 75$).

The optimal staff of either two tier system based on these data
is given by:

$$ M^* = 0.04\,\mu + 0.00282679\,\sigma \tag{A12} $$

In general, the value of $M^*$ will be non-integer and must be rounded
either up or down. In the results shown in Table A-4, both rounded
values were checked in each case in the formula for the expected cost.
Some values are out of range of the Parker approximation but not so far
that a correct choice cannot be made. The optimization for the ground
system must, in general, be checked in the same way, but in this example
the optimal value happens to be integer.
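The numerical coefficients in (A12) follow directly from (A9) with $\theta = 25$, $\gamma = 1250$ and $\gamma' = 75$. The sketch below recomputes them and evaluates the unrounded staff size for one set of inputs; the function names are hypothetical, and the error rates used in the example (10% each, with N = 1000 and N_1 = 50) are taken from Run B of Table A-4 for the satellite/ground system.

```python
import math


def inspection_load(n_sites, n_problem, alpha, beta):
    """Mean and standard deviation of n = n_1 + n_2 (Figure A-2)."""
    mean = n_problem * (1.0 - beta) + (n_sites - n_problem) * alpha
    variance = (n_problem * beta * (1.0 - beta)
                + (n_sites - n_problem) * alpha * (1.0 - alpha))
    return mean, math.sqrt(variance)


def optimal_staff(mu, sigma, theta, gamma, gamma_prime):
    """Unrounded permanent staff size from equation (A9)."""
    log_term = math.log(0.60 * gamma / (0.45 * gamma_prime * theta))
    return (mu - 0.60 * sigma * log_term) / theta


if __name__ == "__main__":
    theta, gamma, gamma_prime = 25.0, 1250.0, 75.0
    # Coefficients of mu and sigma in (A12)
    print(optimal_staff(1.0, 0.0, theta, gamma, gamma_prime))
    print(optimal_staff(0.0, 1.0, theta, gamma, gamma_prime))
    mu, sigma = inspection_load(1000, 50, alpha=0.10, beta=0.10)
    print(mu, sigma, optimal_staff(mu, sigma, theta, gamma, gamma_prime))
```

As the text notes, the result is generally non-integer, so both neighbouring integers still have to be compared in the expected-cost formula before a staff size is chosen.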

It is noteworthy that the values of the survey costs for the two-tier
systems are not significantly changed from those reported in Table 4.
Partly this is due to the fact that satellite, aircraft and penalty factors
are the dominant contributors to the cost. This also indicates that the
simpler model is fairly accurate and that consequently this refinement
may not be needed in many applications.


Direct common cost factors per site:

    Satellite     Aircraft     Cost of misclassification by
                               beta-type errors
    C_s = 200     C_a = 15     C_p = 2000

Ground cost model factors:

    Inspections per inspector per period    theta  = 25
    Cost per inspector per period           gamma  = 1250
    Incremental cost per inspection         gamma' = 75

Input factors:

    Run   N      N_1    alpha_s  beta_s    alpha_a  beta_a
    A     1000   50     20%      25%       05%      15%
    B     1000   50     10%      10%       05%      05%

(Approximate) cost of survey ($1,000):

    Run   Ground   Satellite/Ground   Aircraft/Ground
    A     50       37.0               35.0
    B     50       17.8               25.

Optimal staff: [entries illegible in source]

Table A-4
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APPENDIX B 


PRELIMINARY ANALYSIS OF THE CURRENT
INSPECTION PROCESS

In 1975, as part of this project, MATHEMATICA [3] analyzed
inspection reports for the 1971-74 interim for strip-mining permit
areas in Western Kentucky. Not all inspections were included in inspec-
tion reports. For this reason, the total number of violations in the tables
is low. We have assumed throughout this report that the frequencies
of reported violations per inspection report are not significantly different
from the unknown frequencies of detected violations per inspection.
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Preliminary Analysis of the- Current Inspection Process 

The current Commonwealth of Kentucky strip mine inspection procedures 
call for a strip coal mine to be inspected once every two weeks. In the 
past three years there have been 24 inspectors assigned to the Western 
Kentucky Madisonville office to inspect about a hundred permitted mines 
which operate in this region. Currently there are 11 inspectors with each 
inspector assigned to about 11 mines. 

From the computerized summaries of the 2760 mine inspection reports
for Western Kentucky for the years 1971-1974 we have obtained the following
information which characterizes the inspection situation.

1. Inspection Frequencies

Table 1 following shows data on the number of inspections by month
by year for 1971-1974. Shown also in Table 1 are statistics on the number
of active mines, tonnage, average weeks/mine inspection and M tons/
inspection.

As can be seen, the average interval between inspections (calculated
on the basis of a 50 week year) is significantly greater than the
target value of two (by factor of roughly 3).




TABLE 1

NUMBER OF INSPECTIONS BY MONTH AND YEAR

                                                      Total
    Month         1971    1972    1973    1974    1975    1971-74   Average
    January         49      63      56      55     126       223      55.8
    February        35      59      41      55     115       190      47.5
    March           53      82      55      63       -       253      63.3
    April           36      77      38      56       -       207      51.8
    May             63      75      46      59       -       243      60.8
    June            97      89      35      58       -       279      69.8
    July            74      59      27      52       -       212      53.0
    August          82      70      51      56       -       259      64.8
    September       75      57      42      75       -       249      62.3
    October         44      40      51      68       -       203      50.8
    November        62      38      46      76       -       222      55.5
    December        51      36      37      96       -       220      55.0

    Total          721     745     525     769     241      2760
    Average                62.1    43.8    64.1   120.5               57.5

    Number Mines    85      71      55      90
    No. Inspections/
    Mine Week     .170    .210    .191    .171    (Calculated on basis of
                                                   50-week operating year)
    Weeks/Mine
    Inspection    5.89    4.77    5.24    5.85       -
    MM Tons Pro-
    duced*      31.786  33.645  31.337  28.953
    M Tons/Inspec-
    tion         44.09   45.16   59.69  37.650

* Source: U.S. Bureau of Mines













It is also of interest to analyze this data to determine relevant time
trends and/or seasonal variations. Shown in the margins of Table 1 are
row and column totals and appropriate mean values. Table 2 shows the
complete analysis of variance of the data shown in Table 1. This analysis
suggests the following conclusions:

(i) there is no significant month to month variation in
inspection frequency.

(ii) year to year variations are significant at the .05 level.
Nineteen seventy-three had a significantly lower inspection
count than the other years. It appears that inspection
frequency is keyed to the number of mines.


2. Relationship Between Violations and Inspections 

When an inspection of a mine is performed, a violation (an "incident")
may be reported in one of three broad categories: Method of Operation,
Water Quality, or Revegetation. Each one of these main categories has
several subcategories which are listed in Appendix B'. If this notice does
not work, then as a last resort the State Department for Natural Resources
and Environmental Protection in Frankfort may issue an order of "suspension"
and request that the miner appear at hearings, at which time a spectrum of
actions may be taken ranging from lifting the suspension to fines and
revocation of the permit.

In Western Kentucky the following pattern of "incidents," "non-compliances,"
and "suspensions" existed for the years 1971-1974.
TABLE 2

ANALYSIS OF VARIANCE FOR INSPECTION FREQUENCY DATA

                            SUM OF      DEGREES OF   MEAN                 CRITICAL F
    SOURCE                  SQUARES     FREEDOM      SQUARE     F RATIO   APPROXIMATE @ 95%
    Row Means               1,948       11           177.09     .723      2.13
    (Monthly Variation)
    Column Means            3,176.25    3            1058.75    4.32      2.92
    (Yearly Variation)
    Residual                8,079.75    33           244.84
    TOTAL                   13,204      47

Sum of Squares Computations (Illustrated)

    (i)   Row Means:     [expression illegible in source] = 1,948

    (ii)  Column Means:  [expression illegible in source] = 3,176.25

    (iii) Total:         49^2 + 63^2 + 56^2 + 55^2 + 35^2 + ... - (2760)^2/48 = 13,204

    (iv)  Residual:      By Difference
                                      1971     1972     1973     1974
    Incidents                          633      659      551     1020
    Non-compliances                     35      119      164       71
    Non-compliances/Incidents         .055     .181     .298     .070
    Suspensions                          2        1        3        1
    Suspensions/Non-compliances      0.057    0.008    0.018    0.014

Thus, even though the number of inspections has been relatively
constant from 1971-1974, clearly the number of reported violations
increased in 1974. The ratio of non-compliances to incidents differs
significantly from year to year ($\chi^2 = 179$).

While it is not possible to tell from this data, interviews with the
inspectors suggest that the reason for more incidents occurring is not
that more violations are occurring but rather that the inspections have
become more rigorous. On the other hand, the number of suspensions
has remained small. This could possibly be due to the fact that once
violations are detected they are corrected promptly.

Further insight into the current process can be gained by an exami-
nation of the relationships between violations detected and inspection
frequency. If we let $V$ be the true number of violations, $p(D)$ be the
probability of detection, and $v$ be the expected number of violations detected,
it follows that

$$ v = p(D)\,V. \tag{1} $$

The detection probability is a function of both technical issues (e.g.,
measurement devices) and operational policies (e.g., inspection frequency
and thoroughness). Though detection probabilities are, of course, a
function of many variables, it is generally the case that these are nonlinear
with inspection effort. To illustrate, suppose that in a mining operation a
given violation is detectable only for a certain length of time $r$ (measured
in fractions of a month, for example). If inspections are conducted at
random instants in time and if inspections are perfect (i.e., will always
detect a violation if in progress during the inspection), then it is easy to
show that the single violation detection probability, $p(D)$, is related to the
monthly inspection frequency, $n$, by the following formula:

$$ p(D) = 1 - (1 - r)^n. \tag{2} $$

(In the above equation $r$ can also be interpreted as the single inspection
detection probability.) Inspection of equation (2) reveals several points:

o when $n = 0$, $p(D) = 0$;

o $p(D)$, hence $v$, increases as $n$ increases, but at a
decreasing rate, asymptotically approaching 1 (or $V$).
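The behaviour described by equations (1) and (2) can be tabulated with a few lines of code. This is a minimal sketch: the function names and the sample values of r, n and V are illustrative assumptions, not figures from the report.

```python
def detection_probability(r, n):
    """Equation (2): chance that a violation detectable with single-inspection
    probability r is caught by at least one of n random inspections."""
    return 1.0 - (1.0 - r) ** n


def expected_detections(true_violations, r, n):
    """Equation (1): v = p(D) * V."""
    return detection_probability(r, n) * true_violations


if __name__ == "__main__":
    r = 0.25   # illustrative single-inspection detection probability
    V = 100    # illustrative true number of violations
    for n in range(0, 9):
        p = detection_probability(r, n)
        print(n, round(p, 3), round(expected_detections(V, r, n), 1))
```

Each additional inspection adds less than the one before it, because the increment from n to n + 1 is $r(1-r)^n$, which shrinks geometrically.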

Figure 1 shows actual data on detections and inspection frequency by month
for the years 1971-1974*. Detected violations by month by year are shown
in Table 3A. (A more sophisticated approach would be to compute
inspections/month/mine - but the point can be made in any event.) Though
substantial scatter exists, there is a clear relationship (significant at the
99% level) between violations detected and inspection frequency. This
relationship will later be used to compute "corrected" violation frequencies

* Operational considerations may render truly random inspections impossible
or more costly than fixed or scheduled inspection policies. Other inspection
policies have characteristics different (and in our view poorer) from random
inspection. It is beyond the scope of this paper to elaborate on these differ-
ences.

[FIGURE 1. RELATION BETWEEN DETECTED VIOLATIONS AND INSPECTION FREQUENCY,
1971-1974 DATA (horizontal axis: NUMBER OF INSPECTIONS PER MONTH)]



TABLE 3A. NUMBER OF VIOLATIONS BY MONTH AND YEAR

    Month        1971     1972     1973     1974    Total   Average   St. Dev.
    January        34       70       72       69      245     61.25      18.21
    February       31       50       38       81      200     50.00      22.11
    March          52       88       46      106      292     73.00      28.77
    April          30       76       51       93      250     62.50      27.69
    May            68       70       69       82      289     72.25       6.55
    June           85       71       41       74      271     67.75      18.82
    July           63       47       22       71      203     50.75      21.61
    August         76       64       48       73      261     65.25      12.58
    September      77       40       33      106      256     64.00      34.01
    October        26       38       59       79      202     50.50      23.39
    November       44       22       40       93      199     49.75      30.38
    December       47       23       32       93      195     48.75      31.12

    Total         633      659      551     1020
    Average     52.75    54.92    45.92    85.00
    St. Dev.    20.61    21.39    14.98    12.98

to adjust for changes in inspection frequency. Note that $v$ does not
appear to be reaching an asymptotic value for the inspection frequencies
(1971-1974) - this suggests that detection probabilities are significantly
less than unity (though there are alternative explanations).
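The strength of the relationship plotted in Figure 1 can be gauged from the 48 month-by-year pairs in Tables 1 and 3A. The report does not say which statistic underlies its "significant at the 99% level" statement; the sketch below assumes a simple Pearson correlation with a t-test on 46 degrees of freedom, and the critical value of about 2.69 is an approximate table value rather than a figure from the report.

```python
import math

# Monthly inspections (Table 1) and detected violations (Table 3A), Jan..Dec.
INSPECTIONS = {
    1971: [49, 35, 53, 36, 63, 97, 74, 82, 75, 44, 62, 51],
    1972: [63, 59, 82, 77, 75, 89, 59, 70, 57, 40, 38, 36],
    1973: [56, 41, 55, 38, 46, 35, 27, 51, 42, 51, 46, 37],
    1974: [55, 55, 63, 56, 59, 58, 52, 56, 75, 68, 76, 96],
}
VIOLATIONS = {
    1971: [34, 31, 52, 30, 68, 85, 63, 76, 77, 26, 44, 47],
    1972: [70, 50, 88, 76, 70, 71, 47, 64, 40, 38, 22, 23],
    1973: [72, 38, 46, 51, 69, 41, 22, 48, 33, 59, 40, 32],
    1974: [69, 81, 106, 93, 82, 74, 71, 73, 106, 79, 93, 93],
}


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [v for year in sorted(INSPECTIONS) for v in INSPECTIONS[year]]
ys = [v for year in sorted(VIOLATIONS) for v in VIOLATIONS[year]]
r = pearson_r(xs, ys)
# t statistic for H0: no linear association, with n - 2 degrees of freedom
t_stat = r * math.sqrt(len(xs) - 2) / math.sqrt(1.0 - r * r)

if __name__ == "__main__":
    print(f"r = {r:.3f}, t = {t_stat:.2f} (approx. 1% two-sided critical t: 2.69)")
```

A linear correlation is only a first look; equation (2) implies a concave relationship, so a fit of that form would be the natural next step.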

The actual counts of violations can be misleading if counts are
misinterpreted as costs. This is because the counts of violations within
any category depend on the refinement of violations listed under the
category. For example, if vegetation violations were refined to twenty
types of incidents (rather than the two types, vegetation - current and
vegetation - regulation, used in this report), then the total number of
vegetation violations might be increased tenfold. The actual cost of the
violations is, however, independent of the formulation of the list of vio-
lations. A refined list of violations as used in this working paper is very
useful for analysis of trends and probabilities. However, as done in this
report, violations can be pooled into broad categories. The ultimate
pooling is to use a single category in which a violation is defined to be
one in which at least one incident occurs. Such a reduction of a multiple
violation model to a single violation model is discussed in the working
paper, "A Simplification of the Multiple Violation Model." Table 3B
reveals that on the average about 55% of the inspections result in an
incident. This rate can be used in the cost model.

TABLE 3B. "BATTING AVERAGE" DATA

                                          1971    1972    1973    1974    TOTAL
    Number of Inspections with at
    Least One Incident                     399     397     274     447     1517
    Total Number of Inspections            721     745     525     769     2760
    Fraction of Inspections with at
    Least One Incident                    0.55    0.53    0.52    0.58     0.55


The chi-square analysis given in Table 4 shows that there are no
significant year to year differences among the values in Table 3B. Again,
the year to year or seasonal differences depend on the list of violations.
Table 3B uses only one category (at least one incident) for a violation.
However, we will show that there are indeed both seasonal and yearly dif-
ferences among the aggregate number of violations. For example, vege-
tation incidents increased each year from 1971 to 1974, with no vegetation
incidents in 1971. The aggregate yearly difference could thus be made
even more dramatic if vegetation incidents were counted in twenty different
ways. These results show that it is necessary to consider individual
violations when analyzing trends. As a point of interest, the incidents/
inspection figures are significantly higher for Western Kentucky than for
Eastern Kentucky for 1972 (the only year for which such comparisons can
be made). Another point of interest is that the number of incidents may
represent not only "ground truth" but also changing standards in defining
an incident (as reflected by the fact that no vegetation incidents were
recorded in Western Kentucky in 1971).



TABLE 4

CONTINGENCY TABLE ANALYSIS

Actual frequencies from data (f_ij)

    Year                       1971    1972    1973    1974    Total
    Inspections with
    at least one incident       399     397     274     447     1517
    Incident free
    inspections                 322     348     251     322     1243
    Total                       721     745     525     769     2760

Expected frequencies under null hypothesis (F_ij)

    Year                       1971    1972    1973    1974    Total
    Inspections with
    at least one incident       396*    409     289     423     1517
    Incident free
    inspections                 325     336     236     346     1243
    Total                       721     745     525     769     2760

    * (721)(1517)/2760 = 396

Chi-square computation:

    chi-square = sum over i, j of (f_ij - F_ij)^2 / F_ij   [value illegible in source]

Chi-square is less than expected value of 3 and is thus insignificant.
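The expected frequencies in Table 4 are the usual row-total times column-total over grand-total products, and the test statistic sums the squared deviations over all eight cells. The sketch below recomputes both from the observed counts; the function name is hypothetical, the 5% critical value for 3 degrees of freedom (about 7.81) is a standard table value, and the statistic obtained this way need not equal the figure printed in the original table, which is not legible here.

```python
def chi_square_contingency(table):
    """Pearson chi-square statistic, degrees of freedom and expected counts
    for an r x c contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    expected = [[rt * ct / grand_total for ct in col_totals]
                for rt in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Observed frequencies from Table 4 (columns: 1971, 1972, 1973, 1974).
OBSERVED = [
    [399, 397, 274, 447],   # inspections with at least one incident
    [322, 348, 251, 322],   # incident-free inspections
]

statistic, dof, expected = chi_square_contingency(OBSERVED)

if __name__ == "__main__":
    print([round(x) for x in expected[0]], [round(x) for x in expected[1]])
    print(f"chi-square = {statistic:.2f} on {dof} d.f. (5% critical value: 7.81)")
```

On these counts the statistic stays below the 5% critical value, in line with the conclusion that the yearly fractions in Table 3B do not differ significantly.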
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Yearly and Seasonal Trends 



An analysis of violation frequencies depends on the number of
violations that are defined. As explained in the appendix, violation types
considered in this report are listed under the following four categories:
method of operation, water quality, vegetation, and discrepancies.
Generally, the same trends and conclusions as given in this report will
result for any sufficient refinement of violations where each violation has
weights representative of the 12 types.

We first analyze the yearly trend for the aggregate violation types.
The chi-square analysis for the aggregate number of incidents is given by
Table 5. Because of the large number of detected incidents in 1974, the
chi-square statistic has a very significant value of 96.3. A plausible
explanation for the increase in 1974 has been given in the previous section.


TABLE 5

YEARLY ANALYSIS FOR AGGREGATE INCIDENTS

                              1971     1972     1973     1974    Total
    Observed Incidents         633      659      551     1020     2863
    Expected Incidents       747.9    772.8    544.6    797.9     2863
    Ratio of Observed to
    Expected                  0.85     0.85     1.01     1.28
    Chi-square               17.65    16.76     0.08     61.8     96.3
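The "expected" row of Table 5 can be reproduced by allocating the 2863 incidents to years in proportion to the number of inspections in each year (Table 1). The sketch below does this and sums the chi-square terms; the function names are hypothetical, and small differences from the printed figures reflect rounding in the original.

```python
YEARLY_INSPECTIONS = [721, 745, 525, 769]   # 1971..1974, from Table 1
YEARLY_INCIDENTS = [633, 659, 551, 1020]    # 1971..1974, from Table 5


def proportional_expected(observed, exposure):
    """Expected counts if incidents were proportional to exposure."""
    scale = sum(observed) / sum(exposure)
    return [scale * e for e in exposure]


def chi_square_terms(observed, expected):
    """Per-cell contributions (f - F)^2 / F."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


expected = proportional_expected(YEARLY_INCIDENTS, YEARLY_INSPECTIONS)
terms = chi_square_terms(YEARLY_INCIDENTS, expected)

if __name__ == "__main__":
    for year, e, t in zip(range(1971, 1975), expected, terms):
        print(year, round(e, 1), round(t, 2))
    print("total chi-square:", round(sum(terms), 1))
```

Nearly two thirds of the total comes from the 1974 cell alone, which is the numerical form of the statement that the 1974 increase drives the result.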






















Probably of more importance than a yearly trend is the seasonal 
trend. If seasonal trends exist, then adjustments in the inspection pro- 
cedures can be made to increase the probability of detecting costly 
violations. For this reason, we have given not only a gross seasonal 
analysis for the aggregate incidents, but also a refined breakdown of the 
seasonality trend for each of the four categories. 

We first examine the gross seasonality of aggregate violation
counts. To do this we have calculated the relevant chi-square statistics
as shown in Table 6A. The chi-square statistic was calculated under two
different null hypotheses for violation counts. The first hypothesis is that
violation counts are independent of either the season or the number of
inspections. The chi-square value of 62.8 is very significant and thus
this hypothesis must be rejected.

The second hypothesis adjusts the violation counts by the number of
inspection counts. Under the second hypothesis, violation counts during
any month are proportional to the number of inspections during that month
but are independent of the month. The chi-square value of 27.53 reveals
that again the violation counts do follow a seasonal pattern, i.e., the
assumption of independence by month is invalid. This seasonal trend can
be established by graphing the values of $R_i = f_i/F_i$, the ratio of violation
counts $f_i$ to the average adjusted counts $F_i$. Results of the analyses on
Table 6A are summarized by the following:
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(i) Highly significant (0. 01 level) seasonality in aggregate 
violation counts exist. 

(ii) Months of greatest violation rates are January through
May, with the maximum peak in April. A lower peak
is attained at a point in August through October.

(iii) Months of lower violation rates bottom in July and
December, with December having the lowest rate.

(iv) A summary of the seasonal adjusted factors for the
12 months is shown below:


    Month        % of Average      Month         % of Average
    January          106           July               92
    February         102           August             97
    March            111           September          99
    April            116           October            96
    May              115           November           86
    June              94           December           85


(v) A possible recommendation is that inspection frequencies 
be adjusted to reflect these seasonal differences. 




TABLE 6A

                AGGREGATE      IF NUMBER OF INCIDENTS            IF NUMBER OF INCIDENTS
                NUMBER OF      INDEPENDENT OF NUMBER OF          ADJUSTED FOR NUMBER OF
                INCIDENTS      INSPECTIONS                       INSPECTIONS
                OF ALL         EXPECTED              (f_i-F_i)^2 EXPECTED              (f_i-F_i)^2
                TYPES          NUMBER     RATIO      /F_i        NUMBER     RATIO      /F_i
    MONTH       f_i            F_i        f_i/F_i                F_i        f_i/F_i
    JANUARY     245            238.58     1.027       0.17       231.32     1.059      0.81
    FEBRUARY    200            238.58     0.838       6.24       197.09     1.015      0.04
    MARCH       292            238.58     1.224      11.96       262.44     1.113      3.33
    APRIL       250            238.58     1.048       0.55       214.73     1.164      5.79
    MAY         289            238.58     1.211      10.65       252.07     1.147      5.41
    JUNE        271            238.58     1.136       4.40       289.41     0.936      1.17
    JULY        203            238.58     0.851       5.31       219.91     0.923      1.30
    AUGUST      261            238.58     1.094       2.11       268.67     0.971      0.22
    SEPTEMBER   256            238.58     1.073       1.27       258.29     0.991      0.02
    OCTOBER     202            238.58     0.847       5.61       210.58     0.959      0.35
    NOVEMBER    199            238.58     0.834       6.57       230.28     0.864      4.25
    DECEMBER    195            238.58     0.817       7.96       228.21     0.854      4.83

    TOTAL       2863           chi-square (calc) = 62.80         2863       chi-square (calc) = 27.53
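Both null hypotheses in Table 6A, and the seasonal factors summarized in item (iv), can be recomputed from the monthly totals of Tables 1 and 3A. The sketch below is illustrative: the names are hypothetical, and the 1% critical value for 11 degrees of freedom (about 24.7) is a standard table value rather than a figure quoted in the report.

```python
# Monthly totals for 1971-1974, January..December.
MONTHLY_INSPECTIONS = [223, 190, 253, 207, 243, 279, 212, 259, 249, 203, 222, 220]
MONTHLY_INCIDENTS = [245, 200, 292, 250, 289, 271, 203, 261, 256, 202, 199, 195]


def chi_square(observed, expected):
    """Pearson goodness-of-fit statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


total_incidents = sum(MONTHLY_INCIDENTS)
total_inspections = sum(MONTHLY_INSPECTIONS)

# Hypothesis 1: incidents spread evenly over the 12 months.
uniform = [total_incidents / 12.0] * 12
# Hypothesis 2: incidents proportional to that month's inspections.
adjusted = [total_incidents * x / total_inspections for x in MONTHLY_INSPECTIONS]

chi_uniform = chi_square(MONTHLY_INCIDENTS, uniform)
chi_adjusted = chi_square(MONTHLY_INCIDENTS, adjusted)
# Seasonal factor: observed incidents as a percentage of the adjusted expectation.
seasonal_factors = [round(100.0 * f / F) for f, F in zip(MONTHLY_INCIDENTS, adjusted)]

if __name__ == "__main__":
    print(f"chi-square, unadjusted: {chi_uniform:.2f}")
    print(f"chi-square, adjusted:   {chi_adjusted:.2f} (1% critical value, 11 d.f.: 24.7)")
    print(seasonal_factors)
```

Adjusting for inspection effort removes more than half of the apparent month-to-month variation, yet the remainder is still above the 1% critical value, which is why the seasonal pattern is described as highly significant.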















In Tables 6B, 6C, 6D, and 6E we give chi-square analysis similar
to Table 6A for violations falling within single categories. While gross
seasonal trends in Table 6A have been shown to be mathematically signif-
icant, in the refined analysis only the method of operation category has
significant monthly differences at level .05. Vegetation is significant at
level .10, while water quality is significant only at level .30 ($\chi^2_{.05} = 19.7$,
$\chi^2_{.10} = 17.3$, $\chi^2_{.30} = 12.9$). Hence, it is of importance to give plausible
reasons for these trends in order to establish their validity. That is, a
question that should be answered is whether a particular type of incident
is more likely to occur during a particular time of the year. As an aid to
such a diagnostic study, we have listed the number of violations by type
which occurred each month in Table 6F.


The chi-square statistic is used only for testing statistical signif-
icance and can not be used for comparing categories because the total
number $n$ of incidents falling within a category is not constant. Thus,
$\chi^2 = 73.51$ for method of operation is large, both because there probably
is seasonal variation and because $n = 1614$ is large. For comparison
among categories $\chi^2/n$ should be used (a better statistic is the usual
measure of variation given by the mean square error $s^2 = \sum (f_i - F)^2/(n - 1)$).
Such a comparison shows vegetation has the largest seasonal variation and
water quality has the least. Both water quality and vegetation incidents peak
in the spring and in the fall while method of operation incidents are consis-
tently above average during January through June and below average the
remaining six months.

TABLE 6B

                AGGREGATE      IF NUMBER OF INCIDENTS            IF NUMBER OF INCIDENTS
                NUMBER OF      INDEPENDENT OF NUMBER OF          ADJUSTED FOR NUMBER OF
                INCIDENTS      INSPECTIONS                       INSPECTIONS
                OF ALL         EXPECTED              (f_i-F_i)^2 EXPECTED              (f_i-F_i)^2
                TYPES          NUMBER     RATIO      /F_i        NUMBER     RATIO      /F_i
    MONTH       f_i            F_i        f_i/F_i                F_i        f_i/F_i
    JANUARY     155            134.50     1.152       3.12       130.41     1.189      4.64
    FEBRUARY    122            134.50     0.907       1.16       111.11     1.098      1.07
    MARCH       185            134.50     1.375      18.96       147.95     1.250      9.28
    APRIL       145            134.50     1.078       0.82       121.05     1.198      4.74
    MAY         157            134.50     1.167       3.76       142.10     1.105      1.56
    JUNE        164            134.50     1.219       6.47       163.15     1.005      0.00
    JULY        120            134.50     0.892       1.56       123.97     0.968      0.13
    AUGUST      144            134.50     1.071       0.67       151.46     0.951      0.37
    SEPTEMBER   138            134.50     1.026       0.09       145.61     0.948      0.40
    OCTOBER      85            134.50     0.632      18.22       118.71     0.716      9.57
    NOVEMBER     94            134.50     0.699      12.20       129.82     0.724      9.88
    DECEMBER    105            134.50     0.781       6.47       128.65     0.816      4.35

    TOTAL       1614           chi-square (calc) = 73.51                    chi-square (calc) = 45.99
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TABLE 6C 

SEASONALITY ANALYSIS FOR WATER QUALITY

             INDEPENDENT OF NUMBER OF INSPECTIONS          ADJUSTED FOR NUMBER OF INSPECTIONS
MONTH        f_i     F_i      f_i/F_i  (f_i-F_i)^2/F_i     F_i      f_i/F_i  (f_i-F_i)^2/F_i

JANUARY      45      50.67    0.888    0.63                49.12    0.916    0.35
FEBRUARY     41      50.67    0.809    1.84                41.86    0.980    0.02
MARCH        51      50.67    1.007    0.00                55.73    0.915    0.40
APRIL        53      50.67    1.046    0.11                45.60    1.162    1.20
MAY          62      50.67    1.224    2.54                53.53    1.158    1.34
JUNE         49      50.67    0.967    0.05                61.46    0.797    2.53
JULY         34      50.67    0.671    5.48                46.70    0.728    3.45
AUGUST       56      50.67    1.105    0.56                57.06    0.982    0.02
SEPTEMBER    61      50.67    1.204    2.11                54.85    1.112    0.69
OCTOBER      51      50.67    1.007    0.00                44.72    1.140    0.88
NOVEMBER     60      50.67    1.184    1.72                48.90    1.227    2.52
DECEMBER     45      50.67    0.888    0.63                48.46    0.929    0.25

TOTAL                                  X^2_calc = 15.68


TABLE 6D

SEASONALITY ANALYSIS FOR VEGETATION

             INDEPENDENT OF NUMBER OF INSPECTIONS          ADJUSTED FOR NUMBER OF INSPECTIONS
MONTH        f_i     F_i      f_i/F_i  (f_i-F_i)^2/F_i     F_i      f_i/F_i  (f_i-F_i)^2/F_i

JANUARY      9       16.83    0.535    3.65                16.32    0.551    3.28
FEBRUARY     7       16.83    0.416    5.74                13.91    0.503    3.43
MARCH        21      16.83    1.248    1.03                18.52    1.134    0.33
APRIL        17      16.83    1.010    0.00                15.15    1.122    0.23
MAY          21      16.83    1.248    1.03                17.78    1.181    0.58
JUNE         16      16.83    0.950    0.04                20.42    0.784    0.96
JULY         14      16.83    0.832    0.48                15.52    0.902    0.15
AUGUST       17      16.83    1.010    0.00                18.96    0.897    0.20
SEPTEMBER    20      16.83    1.188    0.60                18.22    1.097    0.17
OCTOBER      26      16.83    1.545    4.99                14.86    1.750    8.36
NOVEMBER     16      16.83    0.950    0.04                16.25    0.985    0.00
DECEMBER     18      16.83    1.069    0.08                16.10    1.118    0.22

TOTAL        202                       X^2_calc = 17.68                      X^2_calc = 17.92
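
As a rough check on the independent-case columns, the statistic can be recomputed from the monthly counts. The following Python sketch assumes a uniform expectation F_i equal to the total divided by twelve, which is what the constant 16.83 in the vegetation table suggests; the function name is illustrative and not part of the original analysis.

```python
def uniform_chi_square(counts):
    """Chi-square statistic against equal expected counts in every month."""
    expected = sum(counts) / len(counts)
    return sum((f - expected) ** 2 / expected for f in counts)

# Monthly vegetation incident counts f_i, January through December.
vegetation = [9, 7, 21, 17, 21, 16, 14, 17, 20, 26, 16, 18]
chi_sq = uniform_chi_square(vegetation)
print(round(chi_sq, 2))
```

The result agrees with the tabulated value of 17.68 to two decimals; the adjusted-case statistic would need the inspection-weighted expectations, which are not recomputed here.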
















TABLE 6E

SEASONALITY ANALYSIS FOR DISCREPANCIES

             INDEPENDENT OF NUMBER OF INSPECTIONS          ADJUSTED FOR NUMBER OF INSPECTIONS
MONTH        f_i     F_i      f_i/F_i  (f_i-F_i)^2/F_i     F_i      f_i/F_i  (f_i-F_i)^2/F_i

JANUARY      36      36.58    0.984    0.01                35.47    1.015    0.01
FEBRUARY     30      36.58    0.820    1.18                30.22    0.993    0.00
MARCH        35      36.58    0.957    0.07                40.24    0.870    0.68
APRIL        35      36.58    0.957    0.07                32.92    1.063    0.13
MAY          49      36.58    1.339    4.21                38.65    1.268    2.77
JUNE         42      36.58    1.148    0.80                44.38    0.946    0.13
JULY         35      36.58    0.957    0.07                33.72    1.038    0.05
AUGUST       44      36.58    1.203    1.50                41.20    1.068    0.19
SEPTEMBER    37      36.58    1.011    0.00                39.61    0.934    0.17
OCTOBER      40      36.58    1.093    0.32                32.29    1.239    1.84
NOVEMBER     29      36.58    0.793    1.57                35.31    0.821    1.13
DECEMBER     27      36.58    0.738    2.51                34.99    0.772    1.83

TOTAL        439                       X^2_calc = 12.33                      X^2_calc = 8.93
















TABLE 6F

NUMBER OF VIOLATIONS BY TYPE BY MONTH
(1971-1974 RAW DATA)

                          JAN  FEB  MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT  NOV  DEC

METHOD OF OPERATION       6    5    17   11   ...  10   ...  7    15   13   11   15
GRADING CURRENT           105  86   129  98   107  114  84   102  93   54   62   65
GRADING PLAN              38   26   34   32   33   32   20   25   20   17   12   17
ACCESS ROAD               6    5    5    4    6    8    9    10   10   1    9    8
SILT STRUCTURE            15   16   ...  19   23   16   13   20   22   19   27   19
WATER QUALITY CHEMICAL    6    10   9    9    7    7    3    9    10   6    5    4
WATER QUALITY PHYSICAL    6    3    1    2    3    1    1    6    7    5    6    7
DRAINAGE PLAN             14   12   19   18   25   17   14   17   17   17   19   11
WATER IMPOUNDMENT         4    0    1    5    4    8    3    ...  4    4    3    3
VEGETATION REGULATION     4    4    11   ...  6    6    5    8    8    10   4    7
VEGETATION CURRENT        5    3    10   9    15   10   9    9    12   16   ...  11
DISCREPANCIES             36   30   35   35   49   42   35   44   37   40   29   27
NON-VIOLATION             93   83   ...  84   103  131  108  115  116  101  113  105












































































































































An analysis was made of the association between the amount of monthly
precipitation and the number of incidents to determine if such an
association could account for a significant percentage of the seasonal
variation. To explain this analysis, let $S_i$ denote the average rainfall
for the $i$-th month and $\bar{S}$ denote the average monthly precipitation
over all months. Then the rate of precipitation for the $i$-th month
relative to the average is defined by

$$x_i = S_i / \bar{S}.$$

Let $y_i$ denote the rate of incidents in the $i$-th month for a given
category. If a linear relation exists between incidents and precipitation,
then, except for random error, $y_i$ is given by

$$y_i = a + b x_i.$$

The value of b is positive if the correlation is positive, negative
if the correlation is negative, and insignificant if there is no significant
correlation. The total seasonal variation for incidents is

$$S_y^2 = \sum_i (y_i - \bar{y})^2.$$

The total seasonal variation for precipitation is

$$S_x^2 = \sum_i (x_i - \bar{x})^2 \qquad (\bar{x} = 1).$$

The correlation R between x and y is defined by

$$R = \frac{S_{xy}}{S_x S_y}, \qquad \text{where } S_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y}).$$

The variation due to the linear relation between y and x is

$$S_y^2 R^2.$$

The percentage of seasonal variation accounted for by precipitation is
simply $100 R^2$ %.
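
The quantities just defined can be sketched in a few lines of Python. This is only an illustration of the definitions: the function name and the two short demonstration series are hypothetical and are not the report's data.

```python
import math

def seasonal_correlation(x, y):
    """Return (S_x^2, S_y^2, S_xy, R) for paired monthly rates x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)   # S_x^2
    syy = sum((yi - y_bar) ** 2 for yi in y)   # S_y^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)             # R = S_xy / (S_x * S_y)
    return sxx, syy, sxy, r

# Hypothetical rates, for illustration only (not taken from the report).
x_demo = [1.2, 0.9, 1.3, 1.0, 0.8, 0.8]
y_demo = [1.1, 0.9, 1.4, 1.0, 0.9, 0.7]

sxx, syy, sxy, r = seasonal_correlation(x_demo, y_demo)
explained = syy * r ** 2           # variation due to the linear relation
percent_explained = 100 * r ** 2   # percentage of seasonal variation
```

With the report's own monthly rates substituted for the demonstration series, `percent_explained` would correspond to the percentage discussed above.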

Below is listed the amount of precipitation in inches/month in
Western Kentucky, averaged over the years 1931-55.

               JAN   FEB   MAR   APR   MAY   JUN   JUL   AUG   SEP   OCT   NOV   DEC

Precipitation  5.10  3.69  5.31  ...   ...   4.09  4.17  3.55  3.10  2.50  3.35  3.92
Rate           ...   0.95  1.36  1.10  0.97  1.05  1.07  0.91  0.79  0.64  0.86  1.00


Table 6G summarizes the analysis of the correlation between the monthly
precipitation values and the monthly rate of incidents by category.
The only significant correlation that was found was in the method of
operation category. The estimated linear relation for this category is

$$y_i = 0.29 + 0.70\,x_i.$$

Figure 2 illustrates the obvious correlation between method of operation
and precipitation.




























TABLE 6G

STATISTICAL ANALYSIS OF CORRELATIONS
BETWEEN MONTHLY PRECIPITATION RATES
AND INCIDENT RATES

                                      METHOD OF
                                      OPERATION    WATER QUALITY    VEGETATION

Total Monthly Variation (S_y^2)       0.355        0.270            1.167
Variation Accounted for by
  Precipitation (S_y^2 R^2)           0.221*       0.070            0.283
Correlation (R)                       0.789*       -0.510           -0.492
F-Statistic (10 R^2 / (1 - R^2))      16.507*      3.524            3.201

* Significant at level .05 if F > 4.96
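
The F-statistics in Table 6G can be approximately recovered from the tabulated correlations. The sketch below assumes twelve monthly observations, so that the statistic has 1 and 10 degrees of freedom; the function name is illustrative.

```python
def f_statistic(r, n_months=12):
    """F statistic for a simple linear fit, from the correlation r."""
    return (n_months - 2) * r ** 2 / (1 - r ** 2)

CRITICAL_F = 4.96  # level .05 critical value quoted under Table 6G

correlations = {
    "METHOD OF OPERATION": 0.789,
    "WATER QUALITY": -0.510,
    "VEGETATION": -0.492,
}
f_values = {name: f_statistic(r) for name, r in correlations.items()}
significant = {name: f > CRITICAL_F for name, f in f_values.items()}
```

The recomputed values differ slightly from the table in the second decimal, presumably because the tabulated correlations are rounded to three places; the significance conclusions are unchanged.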

























4. Probabilities and Measures of Association Between Violations

One of the topics of this report is the frequency or probability (or
marginal probability) with which a violation or incident occurs. By the
true probability of an incident we mean the fraction of days that a specified
incident is expected to occur. Since this is unavailable, we estimate the
probability from the 2760 inspections given in the inspection reports for
the interim 1971-74. A probability is estimated by the ratio of the total
number of occurrences of a particular violation (at most one on any given
inspection) to the total number of inspections. The values of these
probability estimates depend on whether the ratio is made by counting by
a specific month, year, or by counting over all 2760 inspections.

Tables 7A and 7B show the probabilities of occurrences of violations
by category (each category counted at most once per inspection) by month
and by year. These tables again illustrate the yearly and seasonal trends
analyzed in the previous section. A conclusion not arrived at previously
is that the increase in violations in 1974 by category is due to the three
categories: water quality, vegetation, and discrepancies. Method of
operation violations actually decreased in 1974.

Table 8 lists the probabilities of each of the twelve types of incidents,
averaged over all of the inspections. The fact that there are more violations
than inspections (1.0366 = 2863/2760) is consistent with the fact that
several violations occur simultaneously.

The fact that some violations may be dependent on the occurrence
of other violations may be an asset to aerial or satellite inspection.
This is because it is possible that some violations may be easily detectable
from the air or from satellites while others are not. There is less
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TABLE 7A 


















































TABLE 7B 


PROBABILITY OF OCCURRENCES OF VIOLATIONS
BY CATEGORY FOR YEARS 1971-1974

        METHOD OF    WATER
YEAR    OPERATION    QUALITY    VEGETATION    DISCREPANCIES

1971    .5395        .0472      0             .1123
1972    .4846        .1154      .0054         .0859
1973    .3429        .2324      .0590         .1638
1974    .3407        .3147      .1313         .2705

























TABLE 8

VIOLATION FREQUENCIES BY TYPE
TOTAL 1971-1974

                          TOTAL         AVERAGE       RELATIVE
                          VIOLATIONS    NUMBER PER    RATE
VIOLATION TYPE            1971-1974     INSPECTION    (GRADING=1)

GRADING CURRENT           1099          0.3982        1.000
DISCREPANCIES             439           0.1591        0.399
GRADING TO PLAN           306           0.1109        0.278
SILT STRUCTURE            230           0.0823        0.209
DRAINAGE PLAN             200           0.0725        0.182
METHOD OF OPERATION       128           0.0464        0.116
VEGETATION CURRENT        120           0.0435        0.109
WATER QUALITY CHEMICAL    85            0.0308        0.077
ACCESS ROAD               81            0.0293        0.074
VEGETATION REGULATION     81            0.0293        0.074
WATER QUALITY PHYSICAL    48            0.0174        0.044
WATER IMPOUNDMENT         44            0.0159        0.040

TOTAL                     2863          1.0366
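
The two derived columns of Table 8 follow directly from the raw totals. The short Python sketch below reproduces them for three of the twelve types; the dictionary and variable names are illustrative only.

```python
INSPECTIONS = 2760  # total inspections, 1971-1974

# Total violations for three of the twelve types listed in Table 8.
totals = {
    "GRADING CURRENT": 1099,
    "DISCREPANCIES": 439,
    "GRADING TO PLAN": 306,
}

# Average number per inspection: total violations / total inspections.
per_inspection = {name: count / INSPECTIONS for name, count in totals.items()}

# Relative rate, scaled so that GRADING CURRENT equals 1.
reference = totals["GRADING CURRENT"]
relative_rate = {name: count / reference for name, count in totals.items()}
```

Rounded to the precision of the table, these give 0.3982, 0.1591, and 0.1109 per inspection and relative rates of 1.000, 0.399, and 0.278.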











concern about missing a specific violation if there is a high probability
of detecting a different highly correlated violation. Thus, if A and B
are two types of incidents which are highly correlated and A can be
detected while B cannot, then the inference that B has occurred might
be made whenever A is detected.

For the 12 types of incidents, there are 132 (= 144 - 12) conditional
probabilities of one incident given another. Thus, for simplicity, the
analysis of these 132 ordered pairs is better illustrated by analyzing a
single pair. The following numerical example is taken from the
12 x 12 matrices given in Tables 10 through 14.

A Numerical Example 

In this section we provide a numerical example to illustrate the 
definition and computation of various quantities associated with the 
correlation among various violation types. 

The input data for all of these computations is illustrated by
Table 9A below for two violation types: (i) method of operation and
(ii) drainage plan. Referring to the table, we see that of the total of
2760 inspections 34 resulted in both violation types being present, 166
detected a drainage plan violation but no method of operation violation,
etc. Shown to the right of each number is a symbol which will be used
subsequently. Equivalently, we may convert the data of Table 9A into a
table of proportions which shows the probability of each of the various
events of interest. Such a table is shown as Table 9B. To save space
in this and further tables and discussions we define events A and B to
represent method of operation and drainage plan violations respectively.


Table 9A. Association Between
Method of Operation and Drainage Violations

Number of Inspections        Number of Inspections in which
in which the Method of       the Drainage Plan Was:
Operation Was:               In Violation    Not In Violation    Total

In Violation                 34   n_11       94   n_12           128  n_1.
Not In Violation             166  n_21       2466 n_22           2632 n_2.
Total                        200  n_.1       2560 n_.2           2760 n_..


Table 9B. A Probability
Matrix for Violation Types

           B              B̄              Total

A          0.012 p_11     0.034 p_12     0.046 p_1.
Ā          0.060 p_21     0.894 p_22     0.954 p_2.
Total      0.072 p_.1     0.928 p_.2     1.000


For reasons discussed in previous working papers, it is important to
examine the association between events A and B. Suppose, for example,
that events A and B are statistically independent. In this case the
expected number of inspections resulting in the event $A \cap B$ would be
given by $200 \cdot \frac{128}{2760} \approx 9.3$, considerably beneath the
34 cases actually observed. Similar computations for each of the other
events result in the matrix of values shown below:

Table 9C. Expected Frequency
Under Null Hypothesis

           B          B̄          Total

A          9.3        118.7      128
Ā          190.7      2441.3     2632
Total      200        2560       2760

To test the significance of these discrepancies we compute the $\chi^2$
statistic shown below and compare it to the appropriate critical value:

$$\chi^2 = \frac{n_{\cdot\cdot}\left(\left|n_{11} n_{22} - n_{12} n_{21}\right| - \tfrac{1}{2} n_{\cdot\cdot}\right)^2}{n_{1\cdot}\, n_{2\cdot}\, n_{\cdot 1}\, n_{\cdot 2}}$$

which for this example is

$$\chi^2 = \frac{2760\left(\left|34 \cdot 2466 - 166 \cdot 94\right| - \tfrac{1}{2} \cdot 2760\right)^2}{128 \cdot 2632 \cdot 200 \cdot 2560} = 71.53.$$

The critical value (at the 95% level) for this statistic is 3.84, so the
observed association is statistically highly significant, i.e., method
of operation and drainage plan violations are correlated.
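
The expected frequencies and the test statistic for this example can be reproduced with the following Python sketch. It assumes the continuity-corrected form of the statistic, in which half the total count is subtracted inside the square, as the worked numbers indicate; the function names are illustrative.

```python
def expected_counts(n11, n12, n21, n22):
    """Expected cell counts under independence: row total * column total / n."""
    n = n11 + n12 + n21 + n22
    row1, row2 = n11 + n12, n21 + n22
    col1, col2 = n11 + n21, n12 + n22
    return [[row1 * col1 / n, row1 * col2 / n],
            [row2 * col1 / n, row2 * col2 / n]]

def corrected_chi_square(n11, n12, n21, n22):
    """Chi-square for a 2 x 2 table with the n/2 continuity correction."""
    n = n11 + n12 + n21 + n22
    numerator = n * (abs(n11 * n22 - n12 * n21) - n / 2) ** 2
    denominator = (n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22)
    return numerator / denominator

# Counts from Table 9A (method of operation vs. drainage plan).
expected = expected_counts(34, 94, 166, 2466)
chi_sq = corrected_chi_square(34, 94, 166, 2466)
```

Rounded to one decimal, `expected` matches the entries of Table 9C, and `chi_sq` agrees with 71.53 to two decimals.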

There are several ways in which this correlation can be estimated
or illustrated. The first is by a measure termed (unfortunately, since
the name is not descriptive in this context) the relative risk or R
statistic. This statistic is defined below:

$$R = \frac{p(B \mid A)}{p(B \mid \bar{A})}$$

and is the ratio of the conditional probability of event B given that
event A has occurred to that given that event A has not occurred. In terms
of the symbols defined in Table 9A, R is given by

$$R = \frac{n_{11}/n_{1\cdot}}{n_{21}/n_{2\cdot}} = \frac{n_{11}\, n_{2\cdot}}{n_{21}\, n_{1\cdot}} = \frac{34 \cdot 2632}{166 \cdot 128} = 4.212.$$

In this case, drainage plan violations are 4.2 times more likely to occur
when the mine has a method of operation violation than would be the
case if no method of operation violation were noted. If A and B were
independent events, R should be equal to unity.
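
A minimal Python sketch of the relative risk computation, using the cell counts of Table 9A, is given below; the function name is illustrative.

```python
def relative_risk(n11, n12, n21, n22):
    """Ratio of P(B | A) to P(B | not A) from the cells of a 2 x 2 table."""
    p_b_given_a = n11 / (n11 + n12)
    p_b_given_not_a = n21 / (n21 + n22)
    return p_b_given_a / p_b_given_not_a

risk = relative_risk(34, 94, 166, 2466)
```

For these counts `risk` is about 4.21, matching the value worked out above to the precision shown.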


Another descriptive measure of association is the so-called
odds ratio. This is developed as follows:

(i) A measure of the relative likelihood of experiencing an
outcome B when event A has occurred is

$$\Omega_A = \frac{p(B \mid A)}{p(\bar{B} \mid A)} = \frac{n_{11}/n_{1\cdot}}{n_{12}/n_{1\cdot}} = \frac{n_{11}}{n_{12}}.$$

In this example $\Omega_A$ is 34/94 = 0.3617, or in other words,
for every inspection in which the drainage plan is in violation,
there are about 1/0.3617 = 2.76 inspections when no
drainage plan violation occurs, given that there is a method
of operation violation.

(ii) When A is absent, the odds of B's occurrence are
defined as

$$\Omega_{\bar{A}} = \frac{p(B \mid \bar{A})}{p(\bar{B} \mid \bar{A})} = \frac{n_{21}/n_{2\cdot}}{n_{22}/n_{2\cdot}} = \frac{n_{21}}{n_{22}}.$$

In this example $\Omega_{\bar{A}}$ is 166/2466 = 0.06732.

(iii) The two odds $\Omega_A$ and $\Omega_{\bar{A}}$ can be contrasted in a
number of ways to provide a measure of association. The odds
ratio, $\omega$, is currently in greatest use. $\omega$ is defined as

$$\omega = \frac{\Omega_A}{\Omega_{\bar{A}}} = \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}} = \frac{(34)(2466)}{(166)(94)} = 5.373,$$

which, for this example, indicates that the odds of an
inspection turning up a drainage plan violation are 5.4
times as great if a method of operation violation occurs
than if this is not the case. As was the case with the relative
risk measure, the odds ratio is 1 if A and B are
independent.
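
The three steps of the odds ratio development can be sketched in Python as follows, again using the counts of Table 9A. The function and variable names are illustrative only.

```python
def odds_and_ratio(n11, n12, n21, n22):
    """Return (odds of B given A, odds of B given not A, odds ratio)."""
    odds_given_a = n11 / n12        # step (i)
    odds_given_not_a = n21 / n22    # step (ii)
    return odds_given_a, odds_given_not_a, odds_given_a / odds_given_not_a  # step (iii)

odds_a, odds_not_a, odds_ratio = odds_and_ratio(34, 94, 166, 2466)
```

The three results agree with 0.3617, 0.06732, and 5.373 to the precision quoted in the example.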


Having defined and illustrated various statistical concepts relevant
to detecting, testing, and estimating association between events, we now
examine the full set of inspection data.

Table 10 lists the number of times a violation A in a particular
row occurs with a violation B in a particular column. For each such
pair of violations a table similar to Table 9A can be constructed. Note
that of the values in Table 9A, only $n_{11}$ = 34 can be found in Table 10.
The row and column totals $n_{1\cdot}$ = 128 and $n_{\cdot 1}$ = 200 are the
marginal totals given in Table 8. Since $n_{\cdot\cdot}$ = 2760 is known, all
other values in Table 9A can be found by subtraction.
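
The subtraction step can be written out explicitly. The following Python sketch rebuilds the four cells of Table 9A from one joint count, the two marginal totals, and the number of inspections; the function name is illustrative.

```python
def two_by_two(n_joint, n_a, n_b, n_total):
    """Rebuild (n11, n12, n21, n22) from a joint count and marginal totals."""
    n11 = n_joint                          # A and B both in violation
    n12 = n_a - n_joint                    # A only
    n21 = n_b - n_joint                    # B only
    n22 = n_total - n_a - n_b + n_joint    # neither
    return n11, n12, n21, n22

# Method of operation (A) and drainage plan (B).
cells = two_by_two(n_joint=34, n_a=128, n_b=200, n_total=2760)
print(cells)  # (34, 94, 166, 2466)
```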

The probabilities $P(B \mid A)$ and $P(B \mid \bar{A})$ illustrated in Table 9B are
given in Tables 11A and 11B. Large values of $P(B \mid A)$ and small values
of $P(B \mid \bar{A})$ are ideal when the occurrence of A is used to identify B. The
worst case is when $P(B \mid A) = P(B \mid \bar{A})$ (if $P(B \mid \bar{A}) > P(B \mid A)$ then the
non-occurrence of A can be used to predict B). The chi-square statistics
for the two-sided tests of $P(B \mid A) = P(B \mid \bar{A})$ are given in Table 12. Of
the 66 unordered pairs off the diagonal, 29 chi-square statistics were
significant at level .05. Of these 29 cases, 26 were significant at
level .01. Thus, correlation among violations is widespread.
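
As an illustration of how the entries of Tables 11A and 11B relate to the counts in Table 10, the two conditional probabilities for one pair can be computed as below. This is a sketch under the assumption that each entry is a simple ratio of counts; the function name is illustrative.

```python
def conditional_probabilities(n_joint, n_a, n_b, n_total):
    """Return (P(B | A), P(B | not A)) from a joint count and marginal totals."""
    p_b_given_a = n_joint / n_a
    p_b_given_not_a = (n_b - n_joint) / (n_total - n_a)
    return p_b_given_a, p_b_given_not_a

# A = method of operation (128 inspections), B = drainage plan (200 inspections).
p_given_a, p_given_not_a = conditional_probabilities(34, 128, 200, 2760)
```

For this pair the results are approximately 0.2656 and 0.0631, so a method of operation violation raises the estimated probability of a drainage plan violation roughly fourfold.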

Tables 13 and 14 list only those relative risks and odds ratios
for pairs that are significantly dependent. A quick overview of these
tables shows that the violation types within each category tend to be more
closely associated with each other than with violation types outside of
the category. The two types most closely associated with each other are
vegetation-current and vegetation-regulation. Discrepancies are
associated with every type. This is not surprising since the detection
of any violation type increases the probability of a discrepancy.

Through the use of either the odds ratios or the relative risks
tables, one can determine, for any violation type, the violation type
with which it is most closely associated. Thus, the physical and chemical
water types are both associated with silt structure. Note that although
silt structure is associated with drainage plan, one cannot draw the
inference that drainage plan is associated with physical water quality.

As in the seasonal trend analysis, it is important to make a diagnostic
study to determine what, if any, causal relations exist among the pairs
that are associated. The determination of a logical basis for an
association supports the use of such associations for the detection of
incidents. If, on the other hand, associations are not necessary, such
associations cannot be guaranteed to exist in the future. This statement is
particularly relevant if the correlation among violations is to be exploited
for satellite detection purposes. If correlation is not intrinsic and miners
can learn that these correlations are the "tip offs" or "signatures," the
miners can rectify operating procedures in the future so as to deny these
signatures. In this case, then, secrecy is essential.

Table 15 lists the top ten pairs of violation types that are
associated according to the odds ratio measure of association. In six of the
ten cases, a plausible explanation can be given for these associations.
Four pairs where an association is not apparently necessary are:
vegetation current and water impoundment, silt structure and vegetation
current, silt structure and access road, and silt structure and vegetation
regulation. It should be recognized that the estimated odds ratios for some
pairs of incidents (such as vegetation current and water impoundment) may be
either much higher or lower than the true values because the sample
size is too small for an accurate estimate (vegetation current and water
impoundment occurred together only twelve times).


TABLE 10

NUMBER OF OCCURRENCES OF TWO TYPES OF VIOLATIONS ON THE SAME INSPECTION
(DIAGONAL IS THE NUMBER OF VIOLATIONS BY TYPE)

                        METHOD OF   GRADING     GRADING     ACCESS      SILT        WATER       WATER       DRAINAGE    WATER       VEGETATION  VEGETATION  DIS-
                        OPERATION   CURRENT     TO PLAN     ROAD        STRUCTURE   QUALITY     QUALITY     PLAN        IMPOUNDMENT REGULATION  CURRENT     CREPANCIES
                                                                                    CHEMICAL    PHYSICAL

METHOD OF OPERATION     128         87          66          9           23          8           1           34          3           4           6           39
GRADING CURRENT         87          1099        274         53          90          29          15          106         28          35          62          316
GRADING TO PLAN         66          274         306         23          34          8           4           52          11          17          19          158
ACCESS ROAD             9           53          23          81          20          0           4           17          2           6           10          38
SILT STRUCTURE          23          90          34          20          230         17          12          73          2           20          31          95
WATER QUALITY CHEMICAL  8           29          8           0           17          85          4           13          1           5           8           27
WATER QUALITY PHYSICAL  1           15          4           4           12          4           48          6           2           1           3           19
DRAINAGE PLAN           34          106         52          17          73          13          6           200         6           6           12          86
WATER IMPOUNDMENT       3           28          11          2           2           1           2           6           45          4           12          22
VEGETATION REGULATION   4           35          17          6           20          5           1           6           4           82          66          40
VEGETATION CURRENT      6           62          19          10          31          8           3           12          12          66          120         68
DISCREPANCIES           39          316         158         38          95          27          19          86          22          40          68          439



TABLE 11A

MATRIX OF CONDITIONAL PROBABILITIES P(B|A) OF A VIOLATION
GIVEN ANOTHER VIOLATION IS DETECTED

ROWS (A): IF THIS VIOLATION OCCURRED ON A GIVEN INSPECTION
COLUMNS (B): HERE IS THE PROBABILITY THAT THIS VIOLATION ALSO OCCURRED

                        METHOD OF   GRADING     GRADING     ACCESS      SILT        WATER       WATER       DRAINAGE    WATER       VEGETATION  VEGETATION  DIS-
                        OPERATION   CURRENT     TO PLAN     ROAD        STRUCTURE   QUALITY     QUALITY     PLAN        IMPOUNDMENT REGULATION  CURRENT     CREPANCIES
                                                                                    CHEMICAL    PHYSICAL

METHOD OF OPERATION     --          0.6797**    0.5156**    0.0703*     0.1797**    0.0625      0.0078      0.2656**    0.0234      0.0313      0.0469      0.3047**
GRADING CURRENT         0.0792**    --          0.2493**    0.0482**    0.0819      0.0264      0.0136      0.0965**    0.0255**    0.0318      0.0564**    0.2875**
GRADING TO PLAN         0.2157**    0.8954**    --          0.0752**    0.1111      0.0261      0.0131      0.1699**    0.0359**    0.0556**    0.0621      0.5163**
ACCESS ROAD             0.1111*     0.6543**    0.2840**    --          0.2469**    0.0         0.0494      0.2099**    0.0247      0.0741*     0.1235**    0.4691**
SILT STRUCTURE          0.1000**    0.3913      0.1478      0.0870**    --          0.0739**    0.0522**    0.3174**    0.0087      0.0870**    0.1348**    0.4130**
WATER QUALITY CHEMICAL  0.0941      0.3412      0.0941      0.0         0.2000**    --          0.0471      0.1529**    0.0118      0.0588      0.0941*     0.3176**
WATER QUALITY PHYSICAL  0.0208      0.3125      0.0833      0.0833      0.2500**    0.0833      --          0.1250      0.0417      0.0208      0.0625      0.3958**
DRAINAGE PLAN           0.1700**    0.5300**    0.2600**    0.0850**    0.3650**    0.0650**    0.0300      --          0.0300      0.0300      0.0600      0.4300**
WATER IMPOUNDMENT       0.0667      0.6222**    0.2444**    0.0444      0.0444      0.0222      0.0444      0.1333      --          0.0889      0.2667**    0.4889**
VEGETATION REGULATION   0.0488      0.4268      0.2073**    0.0732*     0.2439**    0.0610      0.0122      0.0732      0.0488      --          0.8049**    0.4878**
VEGETATION CURRENT      0.0500      0.5167**    0.1583      0.0833**    0.2583**    0.0667*     0.0250      0.1000      0.1000**    0.5500**    --          0.5667**
DISCREPANCIES           0.0888**    0.7198**    0.3599**    0.0866**    0.2164**    0.0615**    0.0433**    0.1959**    0.0501**    0.0911**    0.1549**    --

MARGINAL
PROBABILITY P(B)        0.0464      0.3982      0.1109      0.0293      0.0823      0.0308      0.0174      0.0725      0.0159      0.0293      0.0435      0.1591

 * SIGNIFICANTLY DIFFERENT FROM P(B) AT SIGNIFICANCE LEVEL .05.
** SIGNIFICANTLY DIFFERENT FROM P(B) AT SIGNIFICANCE LEVEL .01.


TABLE 11B

MATRIX OF CONDITIONAL PROBABILITIES P(B|Ā) OF A VIOLATION
GIVEN ANOTHER VIOLATION IS NOT DETECTED

ROWS (A): IF THIS VIOLATION DID NOT OCCUR ON A GIVEN INSPECTION
COLUMNS (B): HERE IS THE PROBABILITY THAT THIS VIOLATION OCCURRED

                        METHOD OF   GRADING     GRADING     ACCESS      SILT        WATER       WATER       DRAINAGE    WATER       VEGETATION  VEGETATION  DIS-
                        OPERATION   CURRENT     TO PLAN     ROAD        STRUCTURE   QUALITY     QUALITY     PLAN        IMPOUNDMENT REGULATION  CURRENT     CREPANCIES
                                                                                    CHEMICAL    PHYSICAL

METHOD OF OPERATION     --          0.3845**    0.0912**    0.0274*     0.0786**    0.0293      0.0179      0.0631      0.0160      0.0296      0.0433      0.1520
GRADING CURRENT         0.0247**    --          0.0193**    0.0169**    0.0843      0.0337      0.0199      0.0566**    0.0102**    0.0283      0.0349**    0.0741
GRADING TO PLAN         0.0253**    0.3362**    --          0.0236**    0.0799      0.0314      0.0179      0.0603**    0.0139**    0.0265**    0.0412      0.1145**
ACCESS ROAD             0.0444*     0.3904**    0.1056**    --          0.0784**    0.0317      0.0164      0.0683**    0.0161      0.0284*     0.0411**    0.1497**
SILT STRUCTURE          0.0415**    0.3988      0.1075      0.0241**    --          0.0269**    0.0142**    0.0502**    0.0170      0.0245**    0.0352**    0.1360**
WATER QUALITY CHEMICAL  0.0449      0.4000      0.1114      0.0303      0.0796**    --          0.0164      0.0699**    0.0164      0.0288      0.0419*     0.1540**
WATER QUALITY PHYSICAL  0.0468      0.3997      0.1114      0.0284      0.0804**    0.0299      --          0.0715      0.0159      0.0299      0.0431      0.1549
DRAINAGE PLAN           0.0367**    0.3879**    0.0992**    0.0250**    0.0613**    0.0281**    0.0164      --          0.0152      0.0297      0.0422      0.1379**
WATER IMPOUNDMENT       0.0460      0.3945**    0.1087**    0.0291      0.0840      0.0309      0.0169      0.0715      --          0.0287      0.0398**    0.1536**
VEGETATION REGULATION   0.0463      0.3973      0.1079**    0.0280*     0.0784**    0.0299      0.0176      0.0724      0.0153      --          0.0202**    0.1490**
VEGETATION CURRENT      0.0462      0.3928      0.1087      0.0269**    0.0754**    0.0292*     0.0170      0.0712      0.0125**    0.0061**    --          0.1405**
DISCREPANCIES           0.0383**    0.3374**    0.0638      0.0185**    0.0582**    0.0250**    0.0125**    0.0491**    0.0099**    0.0181**    0.0224**    --

MARGINAL
PROBABILITY P(B)        0.0464      0.3982      0.1109      0.0293      0.0823      0.0308      0.0174      0.0725      0.0159      0.0293      0.0435      0.1591

 * SIGNIFICANTLY DIFFERENT FROM P(B) AT SIGNIFICANCE LEVEL .05.
** SIGNIFICANTLY DIFFERENT FROM P(B) AT SIGNIFICANCE LEVEL .01.




TABLE 12

CHI-SQUARE VALUES FOR TESTING IF THE CONDITIONAL PROBABILITY P(B|A)
DIFFERS SIGNIFICANTLY FROM THE MARGINAL PROBABILITY P(B)
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METHOD OF 
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VEGETATION 
REGULATION ‘ 
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49.47 

1159.28 
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DISCREPANCIES 

20.16 

223.76 

325.44 

57.62 

118.95 

15. 29 

18.71 

116. 17 

34. 74 

65.77 

152.66 



SIGNIFICANT AT LEVEL .05 IF X^2 > 3.84
VERY SIGNIFICANT AT LEVEL .01 IF X^2 > 6.63


TABLE 13

ODDS RATIOS FOR MEASURING THE ASSOCIATION OF TWO VIOLATION TYPES
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SIGNIFICANT ONLY AT LEVEL .05 . 

OMITTED VALUES ARE INSIGNIFICANT
UNASTERISKED VALUES ARE SIGNIFICANT AT LEVEL .01


TABLE 14

RELATIVE RISK OF THE OCCURRENCE OF VIOLATION TYPE B SPECIFIC TO THE OCCURRENCE OF TYPE A
HERE IS THE RELATIVE RISK OF THE OCCURRENCE OF THIS VIOLATION
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METHOD OF 
OPERATION. 

GRADING 

CURRENT 

GRADING TO PLAN 

ACCESS ROAD 
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WATER QUALITY 
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DRAINAGE PLAN 
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"SIGNIFICANT ONLY AT LEVEL. 05 
OMITTED VALUES ARE INSIGNIFICANT


TABLE 15

INTERACTION

Vegetation Regulation and Vegetation Current
Grading Current and Grading to Plan
Grading to Plan and Method of Operation
Silt Structure and Drainage Plan
Vegetation Current and Water Impoundment
Drainage Plan and Method of Operation
Silt Structure and Vegetation Current
Silt Structure and Access Road
Silt Structure and Physical Water Quality
Silt Structure and Vegetation Regulation


5. Conclusions


In this working paper we have given a statistical analysis to
show that seasonal and yearly differences do exist for the frequencies
with which violation types occur. We have also shown that significant
dependencies exist among violation types. These results should be
useful in developing future inspection procedures.

Some further statistical analyses can be done with the data from
the 2760 inspections. For example, we have shown that the number of
violations by month and year is linearly related to the number of
inspections during the same months and years. We have also shown that the
number of inspections shows no significant seasonal trend but does have
a significant dependency on years. Since the number of mines does
change from year to year, analyses should be made which examine the
relationship of inspections per year per mine and violations per
inspection per mine by month and year. Also, a probability model
should be developed which assumes the true number of violations at
any time is a variable which increases with the number of mines in
operation. Because mines have variable capacities, a second approach
would be to replace the number of mines with the number of tons of coal
produced. Such analyses can be made by the method of analysis of
covariance.

Some of the analyses given in this report will be included as
input for the cost/effectiveness models of mining inspections by
satellites with follow-up ground or aircraft inspections.
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APPENDIX B’ . 


Violation Categories

The eighteen (18) violations on the Data Entry Form were grouped
into twelve (12) violation types. Types were formed not only on the
basis of the relationships among violations but also on the basis of
assumptions about what Landsat could or could not "see." Violation
types were then aggregated into four (4) very broad violation categories
based upon the relationships among the violation types.

To explain the grouping process let us examine the violations listed
under "Surface Water." Jackson Turbidity Units (JTU), etc., all relate to
the quality of the water discharged from the mining site. These violations
could be grouped into a single violation type. However, to do so would be
to ignore the fact that there are two very different components of water quality
which can be resolved from the data contained on the Data Entry Form. These
components are chemical water quality and physical water quality.

Iron concentration, pH, acidity, and alkalinity are measures of
chemical water quality. We have used only pH and (Fe) since pH and acidity
and alkalinity are, to a certain extent, redundant. Another reason for
excluding acidity and alkalinity is that there was ambiguity concerning the
tests for these parameters.

JTUs and the presence of settleable matter are measures of physical
water quality. Both parameters were used.

By grouping the violations under "Surface Water" in this way, the
inspection data could be used to determine the frequency of occurrence
of these two (2) violation types and also the frequency of joint occurrence
with each other and with other violation types. The frequency of joint
occurrence is important since it is believed that for Landsat, physical
water pollution may be "visible" while chemical water quality may be
"invisible." (In fact, the existence of the former may mask the
presence of the latter.) However, using the frequency of joint occurrence,
it seemed possible to make inferences as to the existence of "invisible"
chemical water quality violations based on "visible" physical water quality
violations.

Similar reasoning was used in forming other violation types. By
grouping violations in this way it was hoped that it would be possible to
enhance Landsat's capability to detect "invisible" violations by detecting
jointly occurring "visible" violations. Another advantage of grouping the
violations into types was that it reduced the number of variables which
were manipulated in the statistical analysis of the inspection data.

The violation types were further aggregated into four (4) broad
categories (see Table B-2). These categories were based upon the relationships
among the violation types. The water quality category, for example,
includes not only the chemical and physical violation types but also the silt
structures and drainage plan violation types. These were included in this
category since properly designed, constructed and maintained structures
are required for water treatment. The water impoundment violation type
was also included in this category since unauthorized impoundments
were believed to be likely sources of chemically polluted water.
TABLE B-1.

ILLUSTRATIVE DATA ENTRY FORM WITH DEFINITIONS

VIOLATION TYPE

ACCESS ROAD
GRADING CURRENT
GRADING TO PLAN
VEGETATION REGULATION

NON-COMPLIANCE
NOT A VIOLATION
TABLE B-2

VIOLATION TYPE BY VIOLATION CATEGORIES

VIOLATION CATEGORY        VIOLATION TYPE

METHOD OF OPERATION       ACCESS ROAD
                          GRADING CURRENT
                          GRADING TO PLAN
                          METHOD OF OPERATION

VEGETATION                VEGETATION REGULATION
                          VEGETATION CURRENT

WATER QUALITY             DRAINAGE PLAN
                          SILT STRUCTURE
                          WATER QUALITY CHEMICAL
                          WATER QUALITY PHYSICAL
                          WATER IMPOUNDMENT

DISCREPANCIES             DISCREPANCIES
FORMERLY WILLOW RUN LABORATORIES, THE UNIVERSITY OF MICHIGAN


Computer Processing Operations
for Kentucky Stripmine Landsat Project

The flow of processing operations is shown in Figure 1. The
beginning point for analysis was Landsat-2 CCT data in 7-track, 800 bpi
format. The first step in processing was to examine the data using
the LIGMALS software package on the University of Michigan Amdahl
470V computer (ref 1 and 2). From this examination, a qualitative
impression of data quality was obtained, level assignments were
determined for graymaps to be produced later, and dark levels in each
band were determined for later processing.

For the 30 October 1975 data (scene 2281-15465), the quality of
the data in MSS channels 6 and 7 was very good. MSS-4 data had a
pronounced striping pattern every sixth line. Some slight striping
also existed in MSS-5. Prints of these four bands are included in
Appendix D.

The dark level correction mentioned is an attempt to account for
the additive effects of atmospheric conditions by determining what
this factor is in each channel and subtracting it. In the absence of
instrumentation to measure this, we determined the lowest signal in
each channel in an area where low reflecting objects ("blackbodies")
occurred. Since the signal from a blackbody would be zero if there
were no path radiance, we assumed that the difference between the
signal we received from our approximations to blackbodies and zero
was a measure of the path radiance. For the 30 October 1975 data,
the values we determined for MSS-4 through MSS-7 were 8, 5, 1, and 0,
respectively.
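
As an illustration, the dark level correction can be sketched as a
per-channel minimum over a dark reference area followed by a
subtraction. The function names and the sample counts are hypothetical;
clamping negative results to zero is an assumption, not something the
text states.

```python
from typing import List, Sequence, Tuple

Pixel = Sequence[int]  # counts in MSS-4, MSS-5, MSS-6, MSS-7 order


def estimate_dark_levels(dark_area: List[Pixel]) -> List[int]:
    """Lowest count in each channel over an area of low-reflecting objects."""
    return [min(channel) for channel in zip(*dark_area)]


def subtract_dark_levels(pixels: List[Pixel],
                         levels: Sequence[int]) -> List[Tuple[int, ...]]:
    """Subtract the per-channel dark level; clamp at zero (an assumption)."""
    return [tuple(max(v - d, 0) for v, d in zip(p, levels)) for p in pixels]


# Hypothetical counts from a dark area, chosen so that the minima match
# the dark levels reported for the 30 October 1975 data (8, 5, 1, 0).
dark_area = [(9, 6, 2, 1), (8, 5, 1, 0), (10, 7, 3, 0)]
levels = estimate_dark_levels(dark_area)

# One hypothetical scene pixel before and after correction.
corrected = subtract_dark_levels([(20, 15, 30, 12)], levels)
```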

The next processing steps were format conversion from Landsat
format to a format compatible with ERIM computers, followed by
implementation of the dark level correction and then data rotation and
scaling. The rotation and scaling are needed to reduce the effects of
earth rotation during the time it takes to scan a scene, to rotate
the data so scan lines run east-west, and to adjust the number of
data points by nearest-neighbor interpolation so that computer line
printer maps would have a scale of 1:24,000.

Following rotation and scaling, the data were edited to the study
area through specification of the vertices of the study area (in
Landsat line and pixel coordinates). A separate tape of the study
area data was made. Then four ratio channels (MSS-5/MSS-4,
MSS-5/MSS-6, MSS-7/MSS-6, MSS-7/MSS-5) were added to the four Landsat
bands through further computer processing. At this time, graymaps
(scale 1:24,000) were prepared of the four Landsat bands and the four
ratios. These graymaps constituted one output product (Appendix D).
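
A minimal sketch of how the four ratio channels might be appended to a
pixel is given below. The function names are hypothetical, and returning
0.0 when a denominator is zero is an assumption; the text does not say
how such pixels were handled.

```python
from typing import Tuple


def safe_ratio(numerator: float, denominator: float) -> float:
    """Ratio of two counts; 0.0 for a zero denominator (an assumption)."""
    return numerator / denominator if denominator else 0.0


def add_ratio_channels(pixel: Tuple[float, float, float, float]) -> Tuple[float, ...]:
    """Extend (MSS-4, MSS-5, MSS-6, MSS-7) with the four ratio channels."""
    mss4, mss5, mss6, mss7 = pixel
    return (
        mss4, mss5, mss6, mss7,
        safe_ratio(mss5, mss4),  # MSS-5/MSS-4
        safe_ratio(mss5, mss6),  # MSS-5/MSS-6
        safe_ratio(mss7, mss6),  # MSS-7/MSS-6
        safe_ratio(mss7, mss5),  # MSS-7/MSS-5
    )


# Hypothetical dark-level-corrected counts for one pixel.
eight_channel_pixel = add_ratio_channels((10, 20, 40, 30))
```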

Based on the ground and aircraft photography and the ultimate
terrain classification categories desired, we selected several areas
representing different types or conditions of materials to use as
training sets. These areas were carefully located on an MSS-5 graymap
(we attempted to avoid mixture or boundary pixels), and signatures
were extracted using the STAT program. Each multispectral signature
is a statistical description of a group of data points (pixels). It
contains the mean value of the signal in each channel and the covariance
matrix, from which the standard deviation of the signal and the
correlation between each pair of channels may be calculated. Each
signature was derived using 8 channels: the four MSS channels and the
ratios MSS-5/MSS-4, MSS-5/MSS-6, MSS-7/MSS-6, and MSS-7/MSS-5.
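
The content of a signature can be illustrated with a short sketch. This
is not the STAT program; it only shows how a mean vector and covariance
matrix yield the standard deviations and correlations mentioned above.
The use of the sample (n - 1) covariance and the two-channel sample
pixels are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def signature(pixels: List[Sequence[float]]) -> Tuple[List[float], List[List[float]]]:
    """Mean vector and sample covariance matrix of a training set."""
    n, k = len(pixels), len(pixels[0])
    mean = [sum(p[i] for p in pixels) / n for i in range(k)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pixels) / (n - 1)
            for j in range(k)] for i in range(k)]
    return mean, cov


def std_devs(cov: List[List[float]]) -> List[float]:
    """Standard deviation of each channel, from the covariance diagonal."""
    return [math.sqrt(cov[i][i]) for i in range(len(cov))]


def correlation(cov: List[List[float]], i: int, j: int) -> float:
    """Correlation between channels i and j."""
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])


# Hypothetical two-channel training pixels.
pixels = [(10.0, 20.0), (12.0, 24.0), (14.0, 28.0)]
mean, cov = signature(pixels)
sigma = std_devs(cov)
rho = correlation(cov, 0, 1)
```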

To complete the training process, we next used unsupervised
clustering. Five rectangles of data were selected which appeared to
contain samples of everything in the scene. The clustering algorithm
was applied in two sweeps through the data: first, looking at every
fourth line and every fourth point in all five rectangles, then back
again looking at every pixel in all rectangles (26167 pixels looked
at in total). An upward limit of 30 clusters had been specified, and
the two passes through the data were done in order to avoid biasing
the clusters toward the materials in one rectangle. The clustering
was performed on the eight channels mentioned previously and a
multispectral signature was generated for each cluster. Our main
objective in running CLUSTR was to avoid missing any significant
categories. In addition, clustering often produces signatures which
encompass the characteristics of a class over a large area better than
a few training-set-generated signatures. For the 30 October 1975 data,
twenty-two acceptable clusters were generated.
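
The order in which pixels are presented in the two sweeps can be
sketched as follows. This illustrates only the sampling pattern, not the
CLUSTR algorithm itself, whose internals are not described here. The
function names and the 8 x 8 sample rectangle are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]
Rectangle = List[List[Pixel]]  # a list of scan lines, each a list of pixels


def coarse_sweep(rect: Rectangle, step: int = 4) -> List[Pixel]:
    """First sweep: every `step`-th line and every `step`-th point."""
    return [px for line in rect[::step] for px in line[::step]]


def full_sweep(rect: Rectangle) -> List[Pixel]:
    """Second sweep: every pixel."""
    return [px for line in rect for px in line]


def two_sweep_order(rects: List[Rectangle], step: int = 4) -> List[Pixel]:
    """Coarse sweep over all rectangles first, then the full sweep."""
    first = [px for r in rects for px in coarse_sweep(r, step)]
    second = [px for r in rects for px in full_sweep(r)]
    return first + second


# Hypothetical rectangle whose "pixels" are labelled (line, point).
rect = [[(line, point) for point in range(8)] for line in range(8)]
coarse = coarse_sweep(rect)
order = two_sweep_order([rect])
```

Because the coarse sweep visits all rectangles before any rectangle is
examined in full, the initial clusters are seeded from the whole scene
rather than from the first rectangle alone.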

Plots of the distribution of the twenty-two cluster-derived
signatures in two-channel space (MSS-5, red and MSS-7, IR) were
compared with similar plots of the 33 training-set-derived signatures.
This enabled us to assign names (classes) to the cluster signatures.
Ellipse plots of the signatures used in the classification of this
data set are shown in Figure 2. The distribution of each signature
class is represented as an ellipse whose boundary is a constant
probability of one chi-square (χ²) distance from the mean. (In the
final CLASFY program which produced the recognition results a χ² value
of 99.99 was used.)
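
For reference, the χ² distance of a point from a two-channel signature
can be sketched as the quadratic form (x - m)' C^-1 (x - m), where m is
the signature mean and C its covariance matrix. The function name and
the sample signature are hypothetical; how CLASFY applies its threshold
is not described here, so the example only evaluates the distance.

```python
from typing import Sequence


def chi_square_distance(x: Sequence[float], mean: Sequence[float],
                        cov: Sequence[Sequence[float]]) -> float:
    """Quadratic-form distance (x - m)' C^-1 (x - m) for two channels."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


# Hypothetical signature in (MSS-5, MSS-7) space.
mean = (20.0, 30.0)
cov = ((4.0, 0.0), (0.0, 9.0))

on_ellipse = chi_square_distance((22.0, 30.0), mean, cov)  # one std dev in MSS-5
outside = chi_square_distance((20.0, 36.0), mean, cov)     # two std devs in MSS-7
```

Points with a distance of exactly one lie on the ellipse drawn for the
signature; larger values lie outside it.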


All but two of the signatures used in the final classification
were cluster derived. Signatures for the final classification were
chosen on the basis of what class they represented and their
separability from other signatures representing other classes. Ellipse
plots and confusion matrices (similar to Table 1) were used to help
determine this separability.

We also investigated the slope/aspect situation; i.e., the
differences in signal received by the sensor due to differences in
irradiance. Problems arise when the same material lies on areas



[Axis label: DATA VALUES MSS-5]

FIGURE 2. ELLIPSE PLOTS OF THE SIGNATURES USED IN THE CLASSIFICATION OF THE 30 OCTOBER 1975
DATA SET (OBSERVATION 2281-15465) IN TWO CHANNEL SPACE (MSS-5 AND MSS-7). The boundary of
each ellipse represents a constant probability of one χ² distance from the mean.




Table 1. CONFUSION (EXPECTED-PERFORMANCE) MATRIX BASED ON THE SEVENTEEN SIGNATURES USED FOR THE
FINAL CLASSIFICATION OF THE WESTERN KENTUCKY STUDY SITE. Rows represent distributions
based on those signatures; columns represent the recognition classes. (Each distribution
consists of 1000 points per signature taken at random and distributed according to the
multivariate normal distribution specified by the signature.) Numbers are in percent and
give the probability that pixels from each signature distribution will be classified into
each recognition class. Dashes indicate zero percent probability. The classifier used in
producing the matrix was the best linear rule classifier, the same one used in the final
classification.
[Body of Table 1: only the following fragments are legible in this
copy. They are given in the order in which they appear.]

...  0.1  -  0.8  -  -  96.8  0.9  -

herbaceous vegetation/14

-  -  -  -  -  -  -  -  0.1  1.1  -  0.1  4.4  -  94.3  -  -

herbaceous vegetation/13

...  0.6  0.7  0.1  93.6
with significantly different slope and aspect such that the irradiance
on the material, and hence the radiance received by Landsat, is
significantly different. Thus a single signature will not be able
to recognize the material on all slopes and aspects, and in fact, one
material with a particular slope and aspect may look like another
material with a different slope and aspect.

Such problems were encountered in this area. For example, one
material with a northerly-facing slope had a mean level digital count
of 10.8 in MSS-5 whereas the same material on a more southerly-facing
slope had a digital count value of 18.1, a 68% change in mean level.
Such a situation causes grave problems in spectral recognition.

One method that has been used to ameliorate the effects of varying
irradiance due to such factors as varying slope and aspect is to
establish signatures using ratios of two spectral channels of digital
data. The resulting ratios are generally less susceptible to variation
in slope and aspect because the irradiance changes tend to be
correlated between the spectral bands.

Due to the significance of the slope-aspect problem in this area
we investigated the utility of ratios. For the same area for which a
single red band (MSS-5) varied 68%, an MSS-7/MSS-5 ratio varied by
34%. This is not complete normalization, but it is obviously an
improvement.
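
The arithmetic behind the 68% figure, and the reason a ratio helps, can
be sketched under an idealized assumption that a change of slope and
aspect multiplies the signal in both bands by the same factor. The MSS-5
counts are those quoted above; the MSS-7 count is hypothetical, since no
MSS-7 values are given.

```python
def percent_change(before: float, after: float) -> float:
    """Change from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (after - before) / before


# MSS-5 mean counts quoted in the text for the two slopes.
mss5_north, mss5_south = 10.8, 18.1
band_change = percent_change(mss5_north, mss5_south)

# Idealized model (an assumption): both bands scale by the same factor.
illumination_factor = mss5_south / mss5_north
mss7_north = 16.2  # hypothetical MSS-7 count on the northerly slope
mss7_south = mss7_north * illumination_factor

ratio_change = percent_change(mss7_north / mss5_north,
                              mss7_south / mss5_south)
```

Under this idealized model the ratio does not change at all. The 34%
variation actually observed indicates that the real signals do not scale
by a single common factor, which is consistent with the statement that
normalization is incomplete.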

Unfortunately, ratioing of channels frequently causes a loss of
information content and sometimes causes a loss of ability to
discriminate between materials that are differentiable using individual
channels of data. Under the circumstances, therefore, we felt it
was best to do our classification using signatures derived from both
individual channels and ratios of channels. The hope was that the
resulting classification would embody some of the beneficial aspects
of both approaches.
One of the major goals of this project was to map the location of
reclaimed, regraded and re-vegetated strip mine areas where the
vegetation cover had reached greater than 70%. Previous experience
has indicated that an infrared/red reflectance ratio is highly
effective at discriminating between various classes of green
vegetation cover, generally better than ocular estimates made by
observers on the ground. In addition, an IR/red reflectance ratio
possesses the advantages of ratios discussed earlier. Furthermore, an
IR/red reflectance ratio has been found to be a good normalizer of
reflectance differences between different soil types and surface soil
moistures. Therefore, we decided to use an MSS-7/MSS-5 ratio to
differentiate between low (<70%) and high (>70%) green vegetation cover.
Ground photos and field notes as well as the aerial photos were
used in deciding on appropriate levels for slicing the MSS-7/MSS-5
ratio. The field notes classified vegetation as being greater or
less than 50% cover, and apparently referred to total (live and dead)
vegetation cover. In addition, we have found ocular estimates of
vegetation cover to be rather consistently too high. Therefore, our
decision on an appropriate MSS-7/MSS-5 ratio slicing level was
heavily dependent on the color IR aerial photos. Although it is
rather difficult to estimate percent green vegetation cover on this
scale of aerial photography, it did afford the advantage of a truly
vertical perspective (for which % cover is defined), a synoptic view,
and high sensitivity to amount of green vegetation, which is a
characteristic of color IR film. When we had picked training sets and
computed their MSS-7/MSS-5 ratios, we compared them with the cluster
signatures. The ratio we had picked to separate >70% green vegetation
from <70% green vegetation (R=1.5) fell between the ratio values
for two large clusters. This was a fortuitous, but beneficial result.

A lower limit was also selected for the ratio so that areas with
negligible green vegetation cover (<10%) would be placed in the bare soil
category.
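
The ratio slicing can be sketched as a pair of thresholds on the
MSS-7/MSS-5 ratio. The upper slicing level R = 1.5 is taken from the
text. The value of the lower limit is not given, so it is left as a
required parameter and the value 1.1 used in the example is purely
hypothetical. The handling of values exactly on a threshold is also an
assumption.

```python
def cover_class(ratio: float, lower: float, upper: float = 1.5) -> str:
    """Assign a green-cover class from an MSS-7/MSS-5 ratio.

    `upper` is the R = 1.5 slicing level from the text; `lower` must be
    supplied by the caller because its value is not stated.
    """
    if ratio < lower:
        return "0-10%"   # later merged with graded bare soil
    if ratio < upper:
        return "10-70%"
    return ">70%"


# Hypothetical lower limit, for illustration only.
LOWER_LIMIT = 1.1

classes = [cover_class(r, lower=LOWER_LIMIT) for r in (1.0, 1.3, 1.5, 2.2)]
```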

When ratio values for percentage cover had been selected, the data from
the test area were classified using 17 signatures and a level slice of
MSS-7 for water recognition. For points classified as herbaceous vegetation,
an estimate of the percentage cover was made using the ratio slicing
previously discussed. Results were then displayed as vegetation with 0-10,
10-70, and >70% cover rather than as individual vegetation classes.

A color coded map (Appendix D) was prepared using a computer line printer
and various colored printing ribbons. At the same time area statistics were
developed. These statistics, shown in Table 2, are the acreages of the various
classes in the test area, as recognized by the computer. For reference, the
percentage composition of the area is also shown.

The thirteen classes of the final recognition map for the 30 October 1975
data were obtained from the 17 training sets by combining the two water classes
into one symbol for display, by telescoping the five herbaceous vegetation
classes into 3 cover classes (as previously discussed), and by combining the
0-10% cover class with the graded bare soil class (Table 3).

As an aid to interpreting the results of classification and to understanding
how areas might be misclassified, a confusion matrix was generated using the
17 final signatures and samples of data drawn from these assumed Gaussian
signatures. The samples of data were classified according to the decision
rule used in the classification program. The results, presented as Table 1,
are not precisely indicative of the accuracy and performance of the classifier
over a large area (because only data from training sets is examined), but do
offer some guidance about probable kinds of errors. Table 1 presents the
percentage of points in each signature class (one row per signature class)
that were classified as a given signature (one column per assigned
signature).
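
The construction of such an expected-performance matrix can be sketched
as a small simulation. This is not the CLASFY program: the classifier
below is a simple minimum normalized distance rule substituted for the
best linear rule, the channels are treated as independent (no covariance
terms), and the two signatures are hypothetical.

```python
import random
from typing import Dict, List

Signature = Dict[str, List[float]]  # {"mean": [...], "std": [...]}


def draw_sample(sig: Signature, rng: random.Random) -> List[float]:
    """One random point from a signature with independent Gaussian channels."""
    return [rng.gauss(m, s) for m, s in zip(sig["mean"], sig["std"])]


def classify(x: List[float], sigs: List[Signature]) -> int:
    """Index of the signature with the smallest normalized squared distance."""
    def dist(sig: Signature) -> float:
        return sum(((v - m) / s) ** 2
                   for v, m, s in zip(x, sig["mean"], sig["std"]))
    return min(range(len(sigs)), key=lambda i: dist(sigs[i]))


def confusion_matrix(sigs: List[Signature], n: int = 1000,
                     seed: int = 0) -> List[List[float]]:
    """Rows: source signature; columns: assigned class; entries in percent."""
    rng = random.Random(seed)
    rows = []
    for sig in sigs:
        counts = [0] * len(sigs)
        for _ in range(n):
            counts[classify(draw_sample(sig, rng), sigs)] += 1
        rows.append([100.0 * c / n for c in counts])
    return rows


# Two hypothetical, well separated two-channel signatures.
sigs = [
    {"mean": [10.0, 12.0], "std": [2.0, 2.0]},
    {"mean": [18.0, 30.0], "std": [2.0, 3.0]},
]
matrix = confusion_matrix(sigs)
```

As in Table 1, each row sums to 100 percent and off-diagonal entries
show which classes a signature is likely to be confused with.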
Table 2. STATISTICS FOR THE THIRTEEN LAND USE CLASSES OBTAINED
FROM RECOGNITION PROCESSING OF THE WESTERN KENTUCKY
STUDY SITE. Data from Landsat-2 observation 2281-15465
obtained 30 October 1975.

                                                              PERCENT
CLASS                                        ACREAGE    OF TOTAL AREA

water                                        1809.50             1.49
marsh                                        1930.50             1.59
lowland forest                              13103.45            10.80
gob                                           584.87             0.48
slurry                                        351.82             0.29
upland forest                               18829.97            15.52
conifers                                     1247.04             1.03
orphan lands                                36077.90            29.75
bare soil (ungraded)                         4229.63             3.49
bare soil (graded)                          10377.44             8.56
>70% green herbaceous cover;
  probably agriculture                       3780.34             3.12
10-70% green herbaceous cover               18269.76            15.06
>70% green herbaceous cover                 10693.40             8.82

                                           121285.62            100.0
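
As a check on the table, the percentage composition can be recomputed
from the acreages. The acreages below are those of Table 2; the
dictionary layout and variable names are merely illustrative.

```python
# Acreages from Table 2, keyed by class name.
acreage = {
    "water": 1809.50,
    "marsh": 1930.50,
    "lowland forest": 13103.45,
    "gob": 584.87,
    "slurry": 351.82,
    "upland forest": 18829.97,
    "conifers": 1247.04,
    "orphan lands": 36077.90,
    "bare soil (ungraded)": 4229.63,
    "bare soil (graded)": 10377.44,
    ">70% green herbaceous cover; probably agriculture": 3780.34,
    "10-70% green herbaceous cover": 18269.76,
    ">70% green herbaceous cover": 10693.40,
}

total_acreage = sum(acreage.values())

# Percent of total area for each class, rounded as in the table.
percent_of_total = {name: round(100.0 * acres / total_acreage, 2)
                    for name, acres in acreage.items()}
```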






Table 3.

THE THIRTEEN CLASSES OF THE FINAL RECOGNITION MAP OF THE
WESTERN KENTUCKY STUDY SITE AND THEIR DERIVATION. Data was
from Landsat-2 observation 2281-15465 obtained 30 October
1975. Numbers and names refer to specific signatures.
(See Figure 2 and Table 1.)

CLASS

water
"marsh" (may be shrub swamp or some other wetland type)
gob
slurry
lowland forest
upland forest
conifers
orphan land
bare soil (ungraded)
bare soil (graded)
>70% green herbaceous cover; probably agriculture
10-70% green herbaceous cover
>70% green herbaceous cover

DERIVATION

level slice of MSS-7; this included all data points classified under
  sig. no. 21 and 9 and some points classified under sig. no. 8, GOB,
  SLURRY and LOWLAND
8
GOB
13
LOWLAND
16
10
2
11
12 (and MSS-7/MSS-5 level slice of sig. no. 25, 4, 6, 14, 18)
MSS-7/MSS-5 level slice of sig. no. 25, 4, 6, 14, 18
MSS-7/MSS-5 level slice of sig. no. 25, 4, 6, 14, 18
Figure 2

High altitude aerial color infrared photograph of the test area
east of Madisonville, Kentucky. Note "anchor" lake in lower center.
Figure 5

(Area 109 of Fig. 4)
Strip mined and regraded
to low rolling hills.
Fescue and clover with
mixed hardwood and pine
saplings. Less than 50%
ground cover. A- is an
oblique view, and B- is
a vertical view of the
surface.
Figure 6

(Area 110 of Fig. 4)

Strip mined and regraded
to rolling pasture. Fescue
and alfalfa predominate.
Nearly 100% ground cover
with a few bare spots. A-
is an oblique, and B- is a
vertical view of the surface.
Figure 7

(Area 113 of Fig. 4)
Stripped and strike-off
graded to long flat-topped
ridges. Mixed scrub
hardwoods.


Figure 8

(Areas 113 and 114 of Fig. 4)
Unmined forested area west
of "anchor" lake. Mixed lowland
hardwoods with scattered
shrubs and leaf litter floor.
Figure 9

(Area 115 of Fig. 4)
Unmined agricultural area
southwest of "anchor" lake.
Uniformly brown soybean
field.


Figure 10

(Area 116 of Fig. 4)

Old slurry pond south of
"anchor" lake. Fine-grained
coal refuse from a coal
washing facility.
Figure 13

Color-coded recognition map of test area. Note
upper right area. (See color legend in Fig. 14).

Figure 14

Color legend for recognition map of Fig. 13. Number refers to
number of pixels in each category within the test strip.






